PREFACE


In my book [1],* which is devoted to the foundations of classical statis- 
tical mechanics, I point out (page 7) that the same method may, in prin- 
ciple, be applied to the construction of the mathematical foundations of 
quantum statistics. However, since all aspects of this method must undergo 
certain changes in form, I decided to write a special monograph devoted 
to the foundations of quantum statistics. This plan took almost ten years 
to complete, partly because of the burden of other work, and partly because 
of the difficulty in applying the method: Inclusion of the “new” statistics 
(symmetric and antisymmetric) required a more serious modification of the 
method than I had originally thought necessary. 

Despite more or less significant alterations of a technical nature, the 
central idea of the method remains unchanged. In the area of quantum 
statistics, I show that a rigorous and systematic mathematical basis of the 
computational formulas of statistical physics does not require a special 
unwieldy analytical apparatus (the method of Darwin-Fowler), but may 
be obtained from an elementary application of the well-developed limit 
theorems of the theory of probability. Apart from its purely scientific value, 
which is evident and requires no comment, the possibility of such an appli- 
cation is particularly satisfying to Soviet scientists, since the study of these 
limit theorems was founded by P. L. Chebyshev and was developed fur- 
ther by other Russian and Soviet mathematicians. The fact that these 
theorems can form the analytical basis for all the computational formulas 
of statistical physics once again demonstrates their value for applications. 

This monograph, like my first book, is devoted entirely to the mathe- 
matical method of the theory and is in no way a complete physical treatise. 
In fact, no concrete physical problem is considered. The book is directed 
primarily towards the mathematical reader. However, I hope that the 
physicist who is concerned with the mathematical apparatus of his science 
will find something in it to interest him. 


29 August 1950 A. KHINCHIN 


* Numbers in square brackets refer to the references listed at the end of the book. 




INTRODUCTION 


§1. The most important characteristics of the mathematical 
apparatus of quantum statistics 


The transition from classical to quantum mechanics involves a basic 
change in the fundamental ideas and concepts of this science. It is there- 
fore not surprising that the mathematical apparatus of statistical mechan- 
ics should undergo a significant change in the transition to the concepts of 
quantum physics. In most cases this change is expressed in a generalization 
or a refinement of the mathematics, but sometimes the introduction of 
essentially new mathematical ideas is required. We begin with the enumer- 
ation of those new concepts of quantum physics which have the greatest 
effect on its statistical apparatus. 

First, we recall two facts which significantly change the external appear- 
ance of the mathematical apparatus of statistical physics yet do not have 
a profound effect on its content: 1) Some physical quantities have discrete 
spectra (denumerable sets of possible values). This fact has only a super- 
ficial effect on the mathematical apparatus. It merely requires that finite 
sums or infinite series be used in place of the usual integrals of classical 
mechanics. 2) Physical quantities, in addition to depending on the usual 
Hamiltonian variables of classical mechanics, depend on “spin” variables 
which are specific to quantum physics and have no analog in the classical 
theory. This fact also causes no change in the basic ideas of statistical 
physics, but only complicates the calculations slightly in certain cases. In 
order not to obscure the fundamental concepts of the theory with details 
which are not of primary importance for the mathematical method, we 
avoid mentioning spin wherever possible in this book. 

A new aspect of quantum mechanics, which is not present in classical 
physics, is the statistical nature of its assertions. In classical mechanics, the 
state of a system uniquely determines the values of all the physical quan- 
tities associated with it. Since every such quantity is a function of the 
Hamiltonian variables, specifying the values of the latter is equivalent to 
specifying the state of the system. In quantum mechanics, the state of a 
system defines the physical quantities only as random variables, i.e., it de- 
termines the laws of distribution obeyed by the physical quantities and not 
their values. This essentially statistical feature of quantum mechanics is 
independent of and distinct from the statistical aspects of the special meth- 
ods of statistical physics. In statistical physics the mean value of a physical 
quantity is found by averaging the quantity over different states of the sys- 
tem. In quantum mechanics, however, we speak of the mean value of a 
quantity in a certain definite fixed state. Therefore, quantum statistics, as 
distinct from the classical theory, is a statistical theory in a double sense
of the word. It is very important to distinguish carefully between the con- 
cepts and computational methods of quantum mechanics on the one hand 
and those of statistical physics on the other. We introduce a special ter- 
minology and system of notation for each of them, and we rigorously avoid 
confusing the two sets of ideas since they effectively have nothing in com- 
mon, except that both are statistical in nature. 

This double statistical character of quantum statistics has a somewhat 
greater effect on the mathematical apparatus than the two facts mentioned 
above. However, even here the necessary changes do not affect the basis 
of the method. The intrinsic statistical nature of quantum mechanics, 
because it is completely independent of the special methods of statistical 
physics, does not cause any change in the essence of these methods. Only 
a new superstructure is required, and this requirement merely changes the 
appearance of the final result. 

Finally, we must consider in detail two new features of quantum me- 
chanics which have a much more profound effect on the apparatus of statis- 
tical physics. In some applications these features require a qualitative 
change in the apparatus. 

The first of these features involves the so-called “new” statistical schemes 
(Bose-Einstein and Fermi-Dirac) which do not and cannot have analogs 
in classical statistical mechanics. In principle, the situation just alluded to 
is also possible in the classical theory: It is only a question of the necessity 
of forming mean values of physical quantities by averaging over some 
(small) fraction of the number of states of the system which have a given 
total energy. In the classical theory such a reduction of the averaging mani- 
fold becomes necessary whenever the equations of motion have a single- 
valued integral which is independent of the energy integral (see [1; §10, p. 
47]). However, such a necessity rarely arises in practice, since under ordinary
conditions integrals of this kind either do not occur at all or, if they
do occur, the averages over the reduced manifold prove to be practically the 
same as the averages over the original complete manifold. 

The transition to the “new” statistics signifies just such a reduction of 
the manifold of “accessible” states of the system over which the averaging 
must be carried out. The reduction is necessary because of the existence 
of a certain single-valued integral of Schrödinger’s equation. (This quantum-mechanical
equation describes the evolution in time of the state of a system
and thus replaces the “equations of motion” of classical mechanics.)
The existence of this integral (we call it “the index of symmetry” in the
following) is the rule, not the exception. Also, the mean values obtained
by averaging over the reduced manifold differ from those obtained by averaging
over the original complete manifold to such a degree that it is absolutely
necessary to calculate these differences. In classical mechanics the
equations of motion cannot have integrals which are in any way analogous 
to this “index of symmetry”. This index describes a specific feature of 
quantum mechanics. 

In constructing a general statistical theory, this necessity to reduce the 
averaging manifold leads to an essential complication of the mathematical 
apparatus. The local limit theorems of the theory of probability remain the 
fundamental basis as before. However, even in the simplest case of systems 
consisting of particles of a single type, which obey symmetric or antisym- 
metric statistics, it is convenient to use two-dimensional limit theorems 
instead of a one-dimensional theorem. The use of a one-dimensional limit 
theorem only suffices for the case of complete statistics (i.e., for the case of 
the classical “Maxwell-Boltzmann scheme”). The reduction of the computational
problems of statistical physics to those of establishing limit
theorems of the theory of probability also undergoes significant changes. 
In addition, the need to carry out all computations on an extremely gen- 
eral basis, which simultaneously includes all three basic statistical schemes, 
naturally makes the exposition more complicated. 

The second specific feature of quantum mechanics which exerts a sub- 
stantial influence on the methods of statistical physics involves the prob- 
lem of the “suitability” of the mean values given by these methods; that 
is, the question of whether these mean values can be verified by experiment. 
(This will be the case if the dispersions are small.) To answer this question, 
it is customary in classical statistical mechanics to formulate so-called 
ergodic hypotheses or theorems. These state that, on the average, a sys- 
tem, whose evolution in time is governed by the equations of motion, re- 
mains in different parts of a given manifold of constant energy for fractions 
of the total time interval which are proportional to the volumes of these 
parts. Therefore, if we observe any physical quantity associated with a 
given system over a definite time interval, the arithmetic average of the 
results of a sufficiently large number of measurements will, as a rule, be close 
to the (theoretical) statistical average. It is well-known that in classical 
statistical mechanics no attempt at such an “ergodic” approach to estab- 
lishing the suitability of theoretical averages has yet led to any completely 
satisfactory solution (despite a series of remarkable isolated successes). 
However, in quantum statistical mechanics such an “ergodic” approach 
turns out to be impossible in principle. A classical mechanical system 
changes its state according to the equations of motion and during the course 
of time its state, at least in principle, can approach as closely as desired to 
any previously specified state which has the same total energy. This state- 
ment is used as a basis for the attempt to compare theoretical averages of 
physical quantities, taken over all possible states of an isolated system, 
with data obtained from measurements of the corresponding quantities of
the same system made at different times in its evolution. In quantum me- 
chanics the situation is completely different. If a system has a definite 
(fixed) total energy (i.e., if the system is in a “stationary” state) and evolves
according to Schrödinger’s equation, then the distribution law of any physical
quantity associated with the system remains invariant in time. (We
prove this in Chapter II, §5.) But in quantum mechanics the state of a 
system determines only the distribution laws obeyed by the physical quan- 
tities associated with the system. We must therefore suppose that the 
state of a system, which has a definite total energy, does not in general 
change with time. Hence, the average of a sequence of measurements per- 
formed on such a system (even if a sequence of this kind were possible 
without radical disruption of the state of the system by each individual 
measurement) should yield a result which has nothing in common with the 
theoretical statistical average, since the latter is obtained by averaging the 
quantity over all states which have the same total energy as the given 
system. 
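
The invariance of the distribution laws in a stationary state, on which this argument rests, can be sketched in one line. The notation is assumed for this illustration only and is not that of Chapter II: $\psi_0$ is an eigenfunction of the energy operator with eigenvalue $E$, and $\varphi_a$ is a normalized eigenfunction of some physical quantity belonging to a nondegenerate eigenvalue $a$. For a time-independent energy operator, Schrödinger's equation changes such a state only by a phase factor of modulus one, which drops out of every probability:

```latex
\begin{align*}
\psi(t) &= e^{-iEt/\hbar}\,\psi_0, \\
\mathsf{P}_t(a) &= \bigl|(\varphi_a, \psi(t))\bigr|^2
  = \bigl|e^{-iEt/\hbar}\bigr|^2\,\bigl|(\varphi_a, \psi_0)\bigr|^2
  = \bigl|(\varphi_a, \psi_0)\bigr|^2 = \mathsf{P}_0(a).
\end{align*}
```

Since the probabilities $\mathsf{P}_t(a)$ do not depend on $t$, repeated measurements on such a system sample one fixed distribution and carry no information about the other states of the same energy.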

Thus, regardless of how we appraise the effectiveness of ergodic methods 
in classical statistical mechanics, in quantum statistics they are in principle 
of no value in establishing the suitability of the theoretical mean values of 
physical quantities. (See [2].) The “time averages” of such quantities, in 
virtue of the above discussion, will, as a rule, be quite different from the 
theoretical mean values. Therefore, in choosing a mathematical apparatus 
in quantum statistics we must consider the need to find other methods for 
establishing the suitability of mean values. As we shall see, this requires a 
very accurate estimate of the remainder terms in the relevant limit theo- 
rems of the theory of probability. In particular, the accuracy must be sig- 
nificantly improved compared to that required for estimating mean values. 

We wish to emphasize once again that despite the various changes neces- 
sary in the mathematical apparatus the central idea of our methods remains 
unchanged in the transition from classical to quantum physics. This idea 
consists in the systematic application of the asymptotic formulas of the 
theory of probability to all the calculations of statistical physics. These 
formulas represent a general study of mass phenomena, and provide a 
rigorous mathematical foundation for statistical physics. Therefore, the 
creation of a special analytical apparatus is unnecessary. 


§2. Contents of the book 


We mentioned in the Preface that this book is intended for two categories 
of readers: physicists interested in the mathematical foundations of their 
science and mathematicians who wish to become acquainted with physical 
applications of mathematics. As a rule, these two groups approach the 
reading of a book with different backgrounds. Therefore, to provide both
types of readers with the minimum amount of material necessary to master 
the basic sections of the book, we have expanded the introductory part 
somewhat. Two long chapters (the first and the second) are devoted en- 
tirely to preliminary material and the treatment of the problems of quan- 
tum statistics does not begin until Chapter III. 

The first chapter contains a discussion and complete proofs of those 
limit theorems of the theory of probability which are used in the main 
sections of the book. We refer here to local limit theorems for sums of iden- 
tically distributed random variables that can assume only non-negative 
integral values. It is well-known that the general conditions for the appli- 
cability of theorems of this type were found only quite recently by B. V. 
Gnedenko and his students. Chapter I contains complete proofs of the 
local theorems for the one-dimensional and two-dimensional cases. The 
fundamental method of Gnedenko is used in these proofs. However, in 
view of the applications to be made of these theorems, the calculations 
are carried out in somewhat more detail in order to obtain not only asymp- 
totic formulas, but also accurate estimates of the remainder terms. Thus, 
this chapter contains a certain element of novelty even for a mathematician 
whose specialty is the theory of probability. For mathematicians of other 
specialties, and also for physicists, it will doubtless be completely new. 
Readers who are not interested in the details of the proofs of the limit the- 
orems should not read the first chapter thoroughly but merely become 
acquainted with the statements of the theorems which are given at the 
end of §§4 and 5. 
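
As a hedged numerical illustration of what a one-dimensional local limit theorem asserts (it is not the proof given in Chapter I), the sketch below compares the exact distribution of a sum of identically distributed non-negative integer variables with its Gaussian approximation. The distribution `P_SINGLE`, the number of summands `N_TERMS`, and all function names are assumptions chosen for this example only.

```python
import math

# Hypothetical single-variable distribution on the values 0, 1, 2
# (chosen only for illustration; it is not taken from the book).
P_SINGLE = [0.5, 0.3, 0.2]
N_TERMS = 200  # number of independent summands


def convolve(p, q):
    """Distribution of the sum of two independent integer variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def exact_sum_distribution(p, n):
    """Exact distribution of the sum of n independent copies of p."""
    dist = [1.0]  # the empty sum equals 0 with probability 1
    for _ in range(n):
        dist = convolve(dist, p)
    return dist


def gaussian_local_approximation(p, n, k):
    """Leading (Gaussian) term of the local approximation to P(S_n = k)."""
    mean = sum(v * pv for v, pv in enumerate(p))
    var = sum(v * v * pv for v, pv in enumerate(p)) - mean ** 2
    exponent = -((k - n * mean) ** 2) / (2 * n * var)
    return math.exp(exponent) / math.sqrt(2 * math.pi * n * var)


exact = exact_sum_distribution(P_SINGLE, N_TERMS)
max_error = max(
    abs(exact[k] - gaussian_local_approximation(P_SINGLE, N_TERMS, k))
    for k in range(len(exact))
)
print(f"largest exact probability: {max(exact):.5f}")
print(f"largest absolute error of the approximation: {max_error:.6f}")
```

The remainder estimates mentioned above play the role of an analytical bound on `max_error`; the sketch only observes that quantity numerically for one arbitrary choice of distribution.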

The second chapter introduces the necessary preliminary concepts of 
quantum mechanics. The educated physicist will, as a rule, find it super- 
fluous. We suggest that he only glance at it to familiarize himself with the 
terminology and the system of notation used in the remainder of the book. 
The mathematician will probably find it necessary to read this chapter. 
However, we must caution him that familiarization with its contents can- 
not replace a preparatory mastery of the fundamental ideas of quantum 
physics which can even be obtained from literature of a more or less popular
character. Chapter II cannot be considered either as a short course or
as a synopsis of a course in quantum mechanics. The choice of material is 
not intended to be exhaustive, but is determined solely by an interest in 
the special problems discussed in the remaining chapters. In particular, 
the second chapter is concerned almost exclusively with the mathematical 
apparatus of quantum mechanics. The physical content of the subject 
has not been emphasized. From a formal point of view this chapter con- 
tains everything necessary to understand the following sections of the 
book. However, for the reader who is totally unacquainted with the ideas 
of quantum physics, it will be of little value: A knowledge of its purely
formal content will only leave the reader in mid-air. (It is sufficient to point 
out that in the entire chapter not a single experiment is mentioned.) We 
repeat, therefore, that the mathematician approaching the study of our 
book must have at least a modest acquaintance with the general ideas of 
quantum physics. As we have said, this acquaintance may be obtained 
from very elementary sources. If the principal physical ideas of quantum 
physics are already known to the reader, then our second chapter will easily 
raise this knowledge to the mathematical level necessary for an understand- 
ing of the following chapters. 

The third chapter contains an exposition of the general ideas and the 
basis of the computational methods of quantum statistics. In classical 
mechanics the statistical theory is used primarily to investigate the sta- 
tistics of various physical quantities associated with a system of given total 
energy. Similarly, in quantum statistics the distribution laws of various 
physical quantities are studied for a system that has a definite total energy. 
Thus, at least at the outset, only those states of a system will be considered 
in which the total energy has a definite value. These states are described 
by eigenfunctions of the total energy operator $\mathcal{H}$.

In classical mechanics the set of states in which the total energy of the 
system has a given constant value forms some “surface of constant energy”
in the phase space. In quantum mechanics the analogous set of states is
the linear manifold $\mathfrak{M}$, whose elements are eigenfunctions of the operator
$\mathcal{H}$ belonging to some definite eigenvalue of this operator. For systems
considered in statistical physics this manifold always has a finite but very
large dimension [a high degree of degeneracy (multiplicity) of the eigenvalues].
In classical mechanics mean values of physical quantities are obtained
by averaging over a given surface of constant energy (or parts of it,
if in addition to the energy integral there are other single-valued integrals
of the motion). Similarly, in quantum statistics the averaging is performed
over the manifold $\mathfrak{M}$ or over parts of it. In fact, as we remarked in §1, it is
necessary to make a significant reduction in this manifold for the majority 
of systems considered in statistical physics. In these cases only symmetric 
or antisymmetric eigenfunctions are admissible. Thus, in the statistical 
problems of quantum physics it is necessary to develop computational 
methods for three fundamental statistical schemes: complete, symmetric 
and antisymmetric. For this purpose, we establish first a particular com- 
plete orthogonal system of eigenfunctions for each of these three schemes. 
These functions have great importance for all that follows, and we call 
them the fundamental eigenfunctions. The states which are described by 
these eigenfunctions are called the fundamental states of the system. 

Further, we introduce the notion of “occupation numbers” which is of
basic importance in quantum statistics. Each of these specifies the number
of particles of the system which are found in a particular state. The fundamental
states chosen are especially convenient for statistical calculations
because in each of these states the occupation numbers have definite (fixed) 
values. Thus, some definite set of occupation numbers corresponds to each 
fundamental state in any of the three statistical schemes. Conversely, one 
or several fundamental states correspond to each set of the occupation num- 
bers. The number of fundamental states corresponding to a given set of 
occupation numbers is different for the three basic statistical schemes. 
This difference is the most important consequence of the statistical dis- 
similarity of these schemes. 
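
A minimal sketch of this difference, assuming the standard counting rules for the three schemes (the function name and the example occupation numbers are hypothetical): in the complete scheme the particles are distinguishable, so occupation numbers n_1, n_2, ... with sum N correspond to N!/(n_1! n_2! ...) fundamental states; in the symmetric scheme they correspond to exactly one; in the antisymmetric scheme they correspond to one state if no occupation number exceeds 1, and such a set is otherwise inadmissible (counted here as zero).

```python
from math import factorial


def fundamental_state_count(occupation, scheme):
    """Number of fundamental states for one set of occupation numbers.

    Assumes the standard counting rules; `scheme` is one of
    "complete", "symmetric", "antisymmetric".
    """
    if any(n < 0 for n in occupation):
        raise ValueError("occupation numbers must be non-negative")
    if scheme == "complete":
        # Multinomial coefficient N! / (n_1! n_2! ...); each division is exact.
        count = factorial(sum(occupation))
        for n in occupation:
            count //= factorial(n)
        return count
    if scheme == "symmetric":
        return 1
    if scheme == "antisymmetric":
        return 1 if all(n <= 1 for n in occupation) else 0
    raise ValueError(f"unknown scheme: {scheme}")


SCHEMES = ("complete", "symmetric", "antisymmetric")
for occ in [(2, 1, 0), (1, 1, 1)]:
    counts = {s: fundamental_state_count(occ, s) for s in SCHEMES}
    print(occ, counts)
```

For three particles distributed over three single-particle states, the two example sets already show all three behaviours: several states, exactly one state, or none.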

Many of the most important physical quantities studied in statistical 
physics have a “sum” character, i.e., they are sums of quantities each de- 
pending on the state of only one of the particles which compose the system. 
The mean value of a “sum function” can be written down immediately 
from a knowledge of the mean values of the occupation numbers. If in addi- 
tion to the mean values of the occupation numbers we are able to find the 
mean values of their pairwise products, then we can immediately write the 
dispersion of an arbitrary sum function. These facts explain why authors 
of systematic expositions of quantum statistics consider the determination 
of the mean values of the occupation numbers to be their most important 
initial task. It should be noted, however, that the mean values of the occupation
numbers determine directly the mean values only of sum functions.
Even though the sum functions are the most important functions they do 
not exhaust all quantities which can be of interest in statistical physics. 
Any quantity which depends symmetrically on the states of the particles 
which compose the system can be of interest in statistical physics. While 
sum functions are the simplest and most frequently encountered of these 
symmetric functions, they evidently do not exhaust the set. (Thus the dis- 
persion of a sum function is symmetric but is obviously not a sum func- 
tion.) From the mathematical point of view it would no doubt be an in- 
teresting and worthwhile task to consider a broader class of problems. 
However, we must note that the limit laws for symmetric functions of a 
large number of random variables are still completely undeveloped. [EDITOR’S
NOTE: In several later articles Khinchin did consider a broader class
of symmetric functions. This work has been included here as Supplements
V and VI.]
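
In symbols, the two statements about sum functions made in this paragraph can be sketched as follows. The notation is an assumption introduced here for illustration: $a_r$ is the occupation number of the single-particle state $r$, $f(r)$ is the value of the single-particle quantity in that state, a bar denotes a mean value, and $\mathsf{D}$ denotes a dispersion. A sum function is linear in the occupation numbers, so

```latex
\begin{align*}
F &= \sum_{r} f(r)\, a_r, \\
\overline{F} &= \sum_{r} f(r)\, \overline{a_r}, \\
\mathsf{D}F = \overline{F^2} - \overline{F}^{\,2}
  &= \sum_{r}\sum_{s} f(r)\, f(s)\,
     \bigl(\overline{a_r a_s} - \overline{a_r}\;\overline{a_s}\bigr).
\end{align*}
```

The double sum includes the terms with $r = s$, so the mean squares $\overline{a_r^2}$ are needed along with the means of the products of distinct occupation numbers.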

At the end of the third chapter we show that the problem of establishing 
the suitability of microcanonical averages can be reduced to that of esti- 
mating the microcanonical dispersions of the corresponding physical quan- 
tities. In particular, we derive an expression for the dispersion of sum func- 
tions which is valid for all three statistical schemes. 

After establishing the foundations of the statistical methods of quantum
physics, we give a concrete structure to quantum statistics in the fourth 
and fifth chapters. The fourth chapter is devoted to the statistics of pho- 
tons, and the fifth to the statistics of material particles (i.e., particles with 
non-zero “rest mass”). We start with photons solely for pedagogical rea- 
sons. It is well-known that the number of photons constituting a given 
system is not constant, but can change with time. This makes the statistics 
of a “photon gas” substantially simpler than the statistics of systems con- 
sisting of material particles. Therefore, we develop all the computational 
methods first using this simplest example for which a one-dimensional limit 
theorem suffices. We hope that the reader masters this chapter before pass- 
ing to the more complicated case of material particles. He will then be ac- 
quainted with the fundamental ideas of the method and the purely tech- 
nical complications encountered in Chapter V will not cause him any great 
difficulty. 

The derivation of the fundamental computational formulas is carried 
out in completely parallel fashion in these two chapters. The dimension of 
the linear manifold of eigenfunctions of the operator $\mathcal{H}$ which belong to a
given eigenvalue $E$ is a function of $E$, and is called the structure function
of the system. (In the case of material particles the structure function 
also depends on the number of particles composing the system.) The first 
step of the derivation is to determine the exact expressions for the mean 
values of the occupation numbers and their pairwise products in terms of 
the structure function. These expressions are very simple but are different 
for the different statistical schemes. They enable us to reduce completely 
the problem of finding asymptotic estimates of the mean values of the oc- 
cupation numbers and their pairwise products to that of finding approxi- 
mate expressions for the structure function. The second step is to express 
the structure function for each case in terms of the distribution law of a 
random variable which is defined as the sum of a very large number of 
mutually independent and identically distributed random variables. In 
general, these distribution laws are multi-dimensional. Only in the prob- 
lem of photons are they one-dimensional. Finally, in the third and last 
step of the computation the limit theorems derived in Chapter I are applied 
to obtain asymptotic expressions for these distribution laws. This gives us 
convenient, and at the same time very accurate, approximate expressions 
for the structure function and, hence, for the mean values of the occupation 
numbers and their pairwise products. Using these expressions and the 
methods developed in Chapter III, we easily find approximate expressions
for the mean values and the dispersions of sum functions which are just as
accurate. The precision obtained in this case turns out to be perfectly adequate
to establish the suitability of microcanonical averages of physical
quantities which are sum functions.
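
The first step can be made concrete in the simplest (photon) case with a small brute-force sketch. It assumes, for illustration only, a handful of hypothetical integer single-particle energy levels `LEVELS`, the symmetric scheme, and no constraint on the number of particles. Under these assumptions the structure function Ω(E) is the number of sets of occupation numbers with total energy E, and a direct counting argument expresses the mean occupation number of level r as the sum over k ≥ 1 of Ω(E − kε_r), divided by Ω(E). The function names are not from the book, and the second and third steps (the probabilistic representation and the limit theorems) are not shown.

```python
from itertools import product

# Hypothetical integer energy levels of a single particle (illustration only).
LEVELS = [1, 2, 3, 5]


def structure_function(levels, max_energy):
    """Return omega, where omega[E] is the number of occupation-number sets
    (a_1, a_2, ...) with sum(levels[r] * a_r) == E, for E = 0..max_energy
    (symmetric scheme, particle number not fixed, positive integer levels)."""
    omega = [0] * (max_energy + 1)
    omega[0] = 1
    for eps in levels:  # add one level at a time
        for energy in range(eps, max_energy + 1):
            omega[energy] += omega[energy - eps]
    return omega


def mean_occupation_from_structure(levels, r, energy):
    """Mean of a_r over all states of the given energy, expressed through omega."""
    omega = structure_function(levels, energy)
    eps = levels[r]
    total = sum(omega[energy - k * eps] for k in range(1, energy // eps + 1))
    return total / omega[energy]


def mean_occupation_brute_force(levels, r, energy):
    """The same mean obtained by listing every admissible state explicitly."""
    ranges = [range(energy // eps + 1) for eps in levels]
    states = [a for a in product(*ranges)
              if sum(e * n for e, n in zip(levels, a)) == energy]
    return sum(a[r] for a in states) / len(states)


E_TOTAL = 12
for r in range(len(LEVELS)):
    via_omega = mean_occupation_from_structure(LEVELS, r, E_TOTAL)
    direct = mean_occupation_brute_force(LEVELS, r, E_TOTAL)
    print(f"level {LEVELS[r]}: {via_omega:.6f} (structure function), "
          f"{direct:.6f} (enumeration)")
```

The enumeration is feasible only for toy sizes; the point of the approach described above is that an accurate asymptotic expression for the structure function replaces it when the number of states is very large.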

In the sixth and last chapter the results found previously are used to
define the concept of entropy and to establish by statistical methods the 
basis of the second law of thermodynamics and, hence, of all of thermo- 
dynamics. 

In the “supplements” which complete the book, we consider several 
topics of significant interest that are somewhat peripheral to the main line 
of development of the theory. Therefore, we prefer to assign to them a 
special place at the end of the book. 

We must still add some general remarks concerning the method of ex- 
position adopted in this book. These remarks are necessary for a broader 
understanding of the material. 

1. In the best expositions of statistical physics following the first works 
of Darwin and Fowler, basic use has been made of the mean values of phys- 
ical quantities rather than of their most probable values. Prior to this period, 
the most probable values were invariably used. In regard to this change, 
we will mention two very basic facts: i) In those cases where the mean 
and most probable values of quantities are considerably different from 
each other, the mean values always play the deciding role in descriptions 
of macroscopic phenomena. ii) The methods adopted in statistical physics 
for the calculation of the most probable values always deserve the valid 
criticism of being mathematically incomplete. On the other hand, the 
method of calculation of mean values, formulated by Darwin and Fowler, 
is faultlessly rigorous in mathematical respects. 

In the following, we always speak of the mean values of physical quan- 
tities. It is true that for most quantities of interest in statistical physics, 
the differences between the mean and most probable values are sufficiently 
small so that in practice they may be neglected. However, rather than dem- 
onstrate rigorously that these differences are negligible (a conclusion con- 
sidered obvious by those who use most probable values) we present a suffi- 
ciently well-developed theory of mean values so that a consideration of 
most probable values is unnecessary. 

The sole advantage of the (mathematically unrigorous) calculations of 
most probable values consists in their incontestable relative simplicity. 
There is no doubt that the method of Darwin and Fowler, which is based 
on a specially constructed analytical apparatus, is very complicated mathe- 
matically. This explains its relative unpopularity among physicists. But, 
as we mentioned in the Preface, the primary purpose of this book is to show 
that there is no need for a special analytical apparatus to justify rigorously 
the methods of calculation of mean values of physical quantities. The cal- 
culation leads to a completely elementary application of general and well- 
known limit theorems of the theory of probability. Thus, the last purely 
practical objection to the transition from most probable values to mean 
values can be eliminated. 

2. The reader who is acquainted with this subject will probably notice
that in our book, in contrast to the majority of contemporary expositions 
of this subject, no mention is ever made of the so-called “statistical en- 
sembles”. The systematic use of this term usually indicates that the phys- 
ical system being studied is considered as an element of some set of physical 
systems which have the same structure as the given system but are in dif- 
ferent states. However, it should be understood that in all statistical the- 
ories, a physical quantity is given a value which is obtained by averaging 
the quantity over all different states of the observed system. Such a value, 
in general, corresponds to the mean value of the results of a large number 
of experimental observations of the quantity. This connection of the sta- 
tistical theory with the physical world is discussed in great detail and is 
often emphasized throughout our book. It appears to us, however, that the 
systematic assignment of a physical system of the same structure to each 
of the possible states of the given system (thus forming the “ensemble” 
introduced by many authors) is completely superfluous and only hinders 
an understanding of the theory. We prefer to consider the set of states 
which are possible states for the system (a phase space in classical physics) 
and not to consider a set of systems which are assigned to these states. 
The latter approach only complicates the picture of the phenomenon. This 
same point of view was introduced in our book on the mathematical foundations
of classical statistical mechanics. Instead of assigning a whole “ensemble”
of systems of the same type to the phase space and then following
the evolution of this “ensemble”, we simply spoke of the “natural motion” 
of the phase space itself. (This can be thought of as a space which is con- 
tinuously being transformed onto itself, the motion being like that of a 
simple hydrodynamic model.) This description is simple and convincing 
from both the mathematical and the physical points of view. Only the un- 
necessary assignment of some physical system to each point of the phase 
space is lost. 

3. In keeping with our pedagogical aim, we choose only the simplest 
examples and consciously refrain from considering more complicated situa- 
tions so that the reader can concentrate all of his attention on the mathe- 
matical method. In the text proper we limit ourselves to homogeneous sys- 
tems (i.e., systems consisting of particles of the same structure) and only 
in Supplement I do we show how our method can be applied to hetero- 
geneous systems. The particles composing each system are assumed to be 
enclosed in a vessel of constant volume. It is well-known that this leads
to a discrete energy spectrum

$$\varepsilon_1 \le \varepsilon_2 \le \cdots \le \varepsilon_n \le \cdots$$

for the particles. As is usual in such investigations, we also assume that the
energy levels $\varepsilon_n$ of the particles are integers. In practice, all the energy levels
can be made to approximate integers as closely as desired by choosing a 
sufficiently small unit of energy. Further, as is also customary, we assume 
that the energy of the system is equal to the sum of the energies of its con- 
stituent particles. This means that we neglect the interaction energy of 
the particles; i.e., strictly speaking, we are limiting ourselves to multi- 
atom ideal gases. It is, of course, impossible to follow this point of view to 
its logical conclusion, since in the absence of an interaction between them, 
the particles cannot exchange energy, and the whole statistical problem 
becomes meaningless. Usually, as a result of this difficulty, it is assumed 
(in many cases with good reason) that the interaction between the particles 
is sufficient to guarantee a free exchange of energy between them, but at 
the same time is weak enough so that for all practical energy calculations 
it is possible to equate the energy of the system to the sum of the energies 
of its constituent particles. Thus, the “mixed” terms which express the
interaction energy of the particles can be neglected. 

The primary purpose of our method is to deduce all the asymptotic for- 
mulas necessary for quantum statistics. As usual, these formulas are derived 
on the assumption that the number of particles N of the given system, its 
total energy E, and its volume V are infinitely large quantities which main- 
tain constant ratios (a constant mean energy per particle and a constant 
mean density of the gas). In essence, this means that units of energy and 
volume are chosen so that the ratios E/N and V/N are neither too small 
nor too large. All quantities characterizing the given system which depend 
only on these ratios must therefore be regarded as constants in our asymp- 
totic formulas. 


Chapter I 


PRELIMINARY CONCEPTS OF THE THEORY 
OF PROBABILITY 


§1. Integral-valued random variables 


This book is concerned with the rigorous and detailed mathematical 
bases of the most important formulas of quantum statistics. These are 
established with the help of the limit theorems of the theory of probability, 
since the question of limit theorems of some particular type arises in all 
cases. For a long time these theorems have been of interest to specialists 
and in recent years they have been developed significantly, particularly by 
mathematicians in the U. S. S. R. Nevertheless they are not, as a rule, dis- 
cussed in textbooks and consequently are little known to a wide circle of 
scholars. (As an exception we may mention the book by von Mises [3].) 
Hence, in the present chapter we give both detailed formulations and com- 
plete proofs of the limit theorems which are necessary for our development. 
We assume only that the reader is acquainted with a general text such as 
Feller [4]. 

The type of limit theorem we need is distinguished by the following im- 
portant specific characteristics: 

1) We always consider random variables all of whose possible values 
are integers; 

2) All the limit theorems of interest to us are of the local type, i.e., we
always consider an asymptotic estimate of the probability that the sum of
the random variables being studied assumes some definite value;

3) We can limit ourselves to sums of mutually independent and identi- 
cally distributed random variables that have finite moments up to the fifth 
order inclusive; 

4) In all cases we must find not only an asymptotic formula, but also 
an accurate estimate of the error; 

5) Finally, in addition to the one-dimensional limit theorems, we will be 
equally interested in multi-dimensional limit theorems of the same type 
(in particular, two-dimensional limit theorems). 

Local limit theorems for integral-valued random variables attracted the 
attention of investigators a relatively long time ago, although much less 
effort was devoted to them than to theorems of the “integral” type. Thus, 
in the book by von Mises, one may find rather deep theorems of this latter 
type. However, a sufficiently general formulation of the problems of interest
to us has been achieved only recently. In particular, the limit theorems
of the type we require were first proved by B. V. Gnedenko [5] and
his students [6]. Although they considered multi-dimensional problems, the
fundamental direction of their investigations differs significantly in one 
way from that which we need: While Gnedenko and his students always 
sought more general conditions under which the fundamental limiting re- 
lationship is valid, we, as stated above, can confine ourselves to a very 
narrow class of initial distributions. On the other hand, we cannot be satis- 
fied with deriving limiting relationships, but must estimate the resulting 
error, sometimes rather accurately. Thus, although Gnedenko’s methods 
are completely adequate for our purpose, we need formulations of limit 
theorems which are somewhat different from those given by him and his 
co-workers. This is a further reason for our including a chapter containing 
detailed proofs of the limit theorems we require. 

The random variables we must consider in this book always have only
integers as their possible values. We call such random variables integral-valued.
Evidently, the distribution law of the integral-valued random variable
$\xi$ is completely determined by giving for each integer $n$ the probability

$$P(\xi = n) = p_n \qquad \Bigl[\, p_n \ge 0 \ (-\infty < n < \infty); \quad \sum_{n=-\infty}^{\infty} p_n = 1 \Bigr]$$

that the variable $\xi$ takes on the value $n$. In the future we shall say briefly
that the variable $\xi$ obeys (is subject to) the law $p_n$ or is distributed according
to the law $p_n$.

If the series

$$\sum_{n=-\infty}^{\infty} n p_n$$

converges absolutely, its sum is called the mathematical expectation $E\xi$
of the variable $\xi$. (Sometimes, instead of mathematical expectation, the
term “mean value” of the random variable $\xi$ is used. We carefully avoid
this terminology, since the term “mean value” has a completely different
meaning in this book.) In general, given an absolutely convergent series 


$$\sum_{n=-\infty}^{\infty} f(n) p_n ,$$

where $f(n)$ is an arbitrary real or complex function of the integral argument
$n$, we call the sum of the series the mathematical expectation $Ef(\xi)$ of the
random variable $f(\xi)$. In particular, the mathematical expectation $E\xi^k$ of
the variable $\xi^k$ (if it exists) is called the moment of order $k$ ($k$th moment)
of the variable $\xi$. The mathematical expectation $E(\xi - E\xi)^k$ of the variable
$(\xi - E\xi)^k$ (if it exists) is called the central moment of order $k$ of the variable
$\xi$. The central moment of second order

$$D\xi = E(\xi - E\xi)^2 = \sum_{n=-\infty}^{\infty} (n - E\xi)^2 p_n$$

(if it exists) is called the dispersion (variance) of the variable $\xi$ and is,
along with the mathematical expectation $E\xi$, one of the most important
characteristics of this variable. All the random variables we shall consider
actually possess moments of arbitrary order $k > 0$. However, we shall see
that for the proof of the relevant limit theorems, it is sufficient to assume
the existence of moments of only relatively low orders.
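
The definitions above can be illustrated by a minimal sketch, assuming a law with finite support stored as a Python dictionary `{n: p_n}`. The helper names (`expectation`, `moment`, `central_moment`) and the sample law are hypothetical and serve only as an illustration; exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction

def expectation(law, f=lambda n: n):
    """E f(xi) = sum over n of f(n) * p_n, for a finitely supported law {n: p_n}."""
    return sum(f(n) * p for n, p in law.items())

def moment(law, k):
    """Moment of order k: E xi^k."""
    return expectation(law, lambda n: n ** k)

def central_moment(law, k):
    """Central moment of order k: E (xi - E xi)^k."""
    mean = expectation(law)
    return expectation(law, lambda n: (n - mean) ** k)

# Hypothetical law on the integers -1, 0, 2 (illustration only).
law = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}

mean = expectation(law)               # mathematical expectation E xi
dispersion = central_moment(law, 2)   # dispersion D xi
```

For this law the dispersion also equals `moment(law, 2) - mean ** 2`, which is the usual shortcut for the second central moment.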

If $\xi'$ and $\xi''$ are integral-valued random variables obeying, respectively,
the laws $p_n'$ and $p_n''$, then the sum $\xi' + \xi'' = \xi$ is an integral-valued random
variable. The distribution law $p_n$ of this sum, in addition to depending
on the laws $p_n'$ and $p_n''$, also depends on the form of the mutual dependence
of the variables $\xi'$ and $\xi''$. In particular, if these two variables are
mutually independent, then the numbers $p_n$ are very simply expressed in
terms of the numbers $p_n'$ and $p_n''$. Indeed, in order that $\xi = n$, it is necessary
and sufficient that $\xi' = k$, $\xi'' = n - k$, where $k$ is any integer. Therefore,

$$P(\xi = n) = \sum_{k=-\infty}^{\infty} P(\xi' = k,\ \xi'' = n - k),$$

and in virtue of the mutual independence of the variables $\xi'$ and $\xi''$,

$$p_n = \sum_{k=-\infty}^{\infty} p_k' \, p_{n-k}''.$$

The last equation may be rewritten as

$$p_n = \sum_{k+l=n} p_k' \, p_l''.$$

In the same fashion if we have $s$ mutually independent integral-valued
random variables

$$\xi^{(1)}, \xi^{(2)}, \cdots, \xi^{(s)}$$

with the corresponding distribution laws

$$p_n^{(1)}, p_n^{(2)}, \cdots, p_n^{(s)},$$

then the distribution law $p_n$ of the sum

$$\xi = \sum_{i=1}^{s} \xi^{(i)}$$

may be expressed as

$$p_n = \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} \cdots \sum_{k_{s-1}=-\infty}^{\infty} p_{k_1}^{(1)} p_{k_2}^{(2)} \cdots p_{k_{s-1}}^{(s-1)} p_{a}^{(s)}, \tag{1}$$

where $a = n - \sum_{i=1}^{s-1} k_i$; or, equivalently,

$$p_n = \sum_{\beta = n} p_{k_1}^{(1)} p_{k_2}^{(2)} \cdots p_{k_s}^{(s)}, \tag{2}$$

where $\beta = \sum_{i=1}^{s} k_i$. The expression for the distribution of the sum of
mutually independent random variables in terms of the distributions of
the summands is called the rule of composition of these distributions. Thus,
formulas (1) and (2) express the rule of composition of the distributions
of an arbitrary number of integral-valued random variables.
It is known from the elementary theory of probability that the mathematical
expectation of a sum of an arbitrary number of random variables
is always equal to the sum of their mathematical expectations. If the summands
are mutually independent, the analogous law holds true for the
products. Finally, the dispersion of the sum is equal to the sum of the dispersions
of the summands when the summands are pairwise mutually independent.
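
The rule of composition and the additivity of expectations and dispersions can be checked on a small case. The following is a minimal sketch, assuming finitely supported laws stored as dictionaries `{n: p_n}`; the function names and the dice example are hypothetical, and formulas (1) and (2) are realized here simply by applying the two-term rule repeatedly.

```python
from collections import defaultdict
from fractions import Fraction
from functools import reduce

def compose(law1, law2):
    """Two-term rule of composition: p_n = sum over k + l = n of p'_k * p''_l."""
    out = defaultdict(Fraction)
    for k, pk in law1.items():
        for l, pl in law2.items():
            out[k + l] += pk * pl
    return dict(out)

def compose_many(laws):
    """Composition of s independent laws by folding the two-term rule."""
    return reduce(compose, laws)

def mean(law):
    return sum(n * p for n, p in law.items())

def dispersion(law):
    m = mean(law)
    return sum((n - m) ** 2 * p for n, p in law.items())

# Hypothetical example: the sum of three independent fair dice.
die = {n: Fraction(1, 6) for n in range(1, 7)}
three = compose_many([die, die, die])
```

With exact fractions, `mean(three)` equals `3 * mean(die)` and `dispersion(three)` equals `3 * dispersion(die)`, in agreement with the statements above.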

We must repeatedly consider cases in which the basic element is not one
random variable, but a family of several (two, three or more) mutually
dependent integral-valued random variables $\xi, \eta, \cdots$. For simplicity of
notation we consider the case of a pair $(\xi, \eta)$ of such variables. (All that is
said for this case holds true with the corresponding obvious changes for
any larger family of variables.) A pair of this type is sometimes called a
(two-dimensional) random vector. The probability $P(\xi = l, \eta = m)$ of the
simultaneous realization of the equations $\xi = l$ and $\eta = m$ is denoted by
$p_{lm}$. The set of numbers $p_{lm}$ $(-\infty < l, m < \infty)$ forms the distribution law
of the random pair $(\xi, \eta)$. If $p_l$ and $q_m$, respectively, denote the distribution
laws of the variables $\xi$ and $\eta$, then evidently

$$p_l = \sum_{m=-\infty}^{\infty} p_{lm}, \qquad q_m = \sum_{l=-\infty}^{\infty} p_{lm}. \tag{3}$$

Hence,

$$E\xi = \sum_{l=-\infty}^{\infty} l\, p_l = \sum_{l=-\infty}^{\infty} l \sum_{m=-\infty}^{\infty} p_{lm},$$

and analogously,

$$E\eta = \sum_{m=-\infty}^{\infty} m \sum_{l=-\infty}^{\infty} p_{lm}.$$

It is assumed that all these series converge absolutely.
If $f(\xi, \eta)$ is an arbitrary real or complex function of the variables $\xi$ and
$\eta$, then the quantity

$$Ef(\xi, \eta) = \sum_{l=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} f(l, m)\, p_{lm} \tag{4}$$

is called its mathematical expectation if the double series converges absolutely.
In particular, from formulas (3) or (4) we obtain expressions for
the dispersion of the variables $\xi$ and $\eta$ in terms of the numbers $p_{lm}$:

$$D\xi = E(\xi - E\xi)^2 = \sum_{l=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} (l - E\xi)^2 p_{lm},$$

$$D\eta = E(\eta - E\eta)^2 = \sum_{l=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} (m - E\eta)^2 p_{lm}.$$


The ratio

$$R(\xi, \eta) = \frac{E\{(\xi - E\xi)(\eta - E\eta)\}}{(D\xi\, D\eta)^{1/2}} = \frac{E(\xi\eta) - E\xi\, E\eta}{(D\xi\, D\eta)^{1/2}}$$

is called the correlation coefficient of the variables $\xi$ and $\eta$. The numerator
of this ratio can be written in the form

$$E\{(\xi - E\xi)(\eta - E\eta)\} = \sum_{l=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} (l - E\xi)(m - E\eta)\, p_{lm}.$$
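
Formulas (3) and (4) and the correlation coefficient can be evaluated directly for a finitely supported pair. The sketch below assumes the joint law is stored as a dictionary `{(l, m): p_lm}`; the helper names and the sample law are hypothetical and chosen only for illustration.

```python
import math

def marginals(joint):
    """Formula (3): p_l = sum over m of p_lm, q_m = sum over l of p_lm."""
    p, q = {}, {}
    for (l, m), w in joint.items():
        p[l] = p.get(l, 0.0) + w
        q[m] = q.get(m, 0.0) + w
    return p, q

def expect(joint, f):
    """Formula (4): E f(xi, eta) = double sum of f(l, m) * p_lm."""
    return sum(f(l, m) * w for (l, m), w in joint.items())

def correlation(joint):
    """R(xi, eta) = E{(xi - E xi)(eta - E eta)} / (D xi * D eta)^(1/2)."""
    e_xi = expect(joint, lambda l, m: l)
    e_eta = expect(joint, lambda l, m: m)
    d_xi = expect(joint, lambda l, m: (l - e_xi) ** 2)
    d_eta = expect(joint, lambda l, m: (m - e_eta) ** 2)
    cov = expect(joint, lambda l, m: (l - e_xi) * (m - e_eta))
    return cov / math.sqrt(d_xi * d_eta)

# Hypothetical joint law of a positively dependent pair of 0/1 variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p, q = marginals(joint)
r = correlation(joint)
```

Here both marginal laws are symmetric on the values 0 and 1, and the correlation coefficient works out to 0.6.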


From the random pairs $(\xi', \eta')$ and $(\xi'', \eta'')$ we may form the pair
$(\xi' + \xi'', \eta' + \eta'')$ which is called the sum of the given pairs. In the same fashion,
the sum of an arbitrary number of pairs can be defined. Let $p_{lm}'$, $p_{lm}''$,
$p_{lm}$ denote the distributions of the pairs $(\xi', \eta')$, $(\xi'', \eta'')$, $(\xi' + \xi'', \eta' + \eta'')$,
respectively. The numbers $p_{lm}$ are, in general, not completely defined by
specifying the numbers $p_{lm}'$ and $p_{lm}''$; for this it is also necessary to know
the dependence between the pairs $(\xi', \eta')$ and $(\xi'', \eta'')$. In the most important
case, when the latter pairs are mutually independent (i.e., when
the values taken by the variables $\xi'$, $\eta'$ have no influence on the law $p_{lm}''$, and
conversely) we easily obtain the rule of composition expressing the numbers
$p_{lm}$ in terms of the numbers $p_{lm}'$ and $p_{lm}''$. Moreover, we can obtain
the rule of composition for the addition of an arbitrary number of (mutually
independent) pairs. These formulas (which we shall not introduce
here) are completely analogous to formulas (1) and (2), which were established
above for the one-dimensional case, but are, of course, substantially
more complicated than (1) and (2).


§2. Limit theorems 


In the theory of probability, as in every mathematical theory of a natural 
science, such as theoretical mechanics, thermodynamics and many others, 
one tries to establish conformance to the most general principles. These 
principles would relate not only to the particular processes taking place 
in nature and in human practice, but would include the widest possible 
class of phenomena. For instance, the fundamental theorems of mechanics 
— the theorem of kinetic energy, the (Keplerian) theorem of areas, etc. — 
are not related to any special form of mechanical motion, but to an ex- 
tremely wide class of such motions. In the same way, the fundamental 
propositions of the theory of probability (such as the law of large numbers) 
not only include special forms of mass phenomena, but include extremely 
wide classes of them. It can be said that the essence of mass phenomena is 
revealed in regularities of this type, i.e., those properties of these phe- 
nomena are revealed which are due to their mass character, but which 
depend in only a relatively slight manner on the individual nature of the 
objects composing the masses. For example, sums of random variables, the 
individual terms of which may be distributed according to any of a wide 
range of laws, obey the law of large numbers; neither the applicability nor 
the content of the law of large numbers depends upon these individual dis- 
tribution laws (which must satisfy only certain very general requirements). 

One of the most important parts of the theory of probability, the theory
of limit theorems, was developed because of this desire to establish general
principles of a type which includes the widest possible class of real phenomena.
In a very large number of cases (in particular, in the simplest
problems which arose initially) the mass character of the phenomenon
being studied was taken into account mathematically by investigating 
sums of a very large number of random variables (more or less equally 
significant and mutually dependent or independent). Thus, in the theory 
of measurement errors (one of the first applications of the theory of prob- 
ability) we study the error actually incurred in performing a measurement; 
this error is usually the sum of a large number of individual errors caused 
by very different factors. The law of large numbers is concerned with just 
such sums of a large number of random variables. In the XVIIIth century, 
De Moivre and Laplace showed that in some of the simplest cases the sums 
of large numbers of mutually independent random variables, after proper 
normalization, were subject to distribution laws which approach the so-called
“normal” law as the number of terms approaches infinity. The “density”
of this law is given by the function

$$(2\pi)^{-1/2} e^{-x^2/2}.$$


This was the first of the limit theorems of the theory of probability, the 
so-called theorem of De Moivre and Laplace. It is now studied in all courses 
on the theory of probability. This theorem includes only an extremely nar- 
row class of cases, the so-called Bernoulli trials, where each term has as its 
possible values only the numbers 0 and 1, and the probabilities of these 
values are the same for all terms. However, as was stated by Laplace, the 
causes, due to which distribution laws of sums in the case of Bernoulli trials 
have a tendency to approach the normal law, have a character so general 
that there is every reason to suppose that the theorem of De Moivre and 
Laplace is merely a special case of some much more general principle. Lap- 
lace attempted to find the basis for this tendency to the normal law for a 
wider class of situations. However, neither he nor his contemporaries made 
significant progress in this direction, partly because the methods of mathe- 
matical analysis known at that time were inadequate for this purpose. The 
first method, by which it was possible to prove that the limit theorem is a 
general principle governing the behavior of sums of a large number of 
mutually independent random variables, was not formulated until the 
middle of the XIXth century by P. L. Chebyshev, the great Russian 
scholar. It is well-known that the first general conception of the law of large 
numbers is due to him. In general, the desire to establish principles of wide 
validity, which is common to every natural science and of which we spoke 
at the beginning of the present section, was noticeable in the theory of prob- 
ability only after the investigations of Chebyshev. 

Chebyshev tried to formulate a general limit theorem during almost all
of his scientific life. He finally found a suitable formulation, but did not
succeed in proving the theorem itself. The proof was completed shortly 
after Chebyshev’s death by his student and successor A. A. Markov. How- 
ever, several years before the work of Markov, A. M. Liapunov, who was 
also a student of Chebyshev, proved the limit theorem under extremely 
general conditions by a different method, which more closely resembles the 
contemporary proof. 

The central limit theorem, first proved by Liapunov and later refined 
by other investigators, asserts that under certain general conditions the 
distribution law of a suitably normalized sum of a large number of mutually 
independent random variables should be close to the “normal” law introduced
above. Here we are speaking of the so-called “integral” laws: If this
normalized sum is denoted by $S$, then the probability of the inequality
$S < x$ must be close to

$$(2\pi)^{-1/2} \int_{-\infty}^{x} e^{-u^2/2}\, du.$$

It is immediately evident that in the general case, where the type of dis- 
tribution laws of the summands themselves is unknown, any other formula- 
tion of the problem is impossible. Thus, for example, if the summands are 
integral-valued random variables, their sum may assume only integral 
values. To pose the question of the limiting behavior of the “density” of the 
distribution law of the sum would be meaningless in this case. 
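
The integral form of the theorem can be observed numerically in the Bernoulli case. The following is a minimal sketch, assuming fair 0/1 summands and the normalization $S = (s_n - n/2)/(\sqrt{n}/2)$; the function names, the value of `n`, and the grid are hypothetical choices. Because the sum is integral-valued, its distribution function has jumps, so the agreement is only approximate.

```python
import math

def normal_cdf(x):
    """Integral normal law: (2*pi)^(-1/2) * integral from -inf to x of exp(-u^2/2) du."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normalized_sum_cdf(n, x):
    """P(S < x) for S = (s_n - n/2) / (sqrt(n)/2), s_n a sum of n fair 0/1 terms."""
    threshold = n / 2 + x * math.sqrt(n) / 2
    favourable = sum(math.comb(n, k) for k in range(n + 1) if k < threshold)
    return favourable / 2 ** n

n = 400
grid = [j / 4 for j in range(-12, 13)]   # x from -3 to 3 in steps of 0.25
max_gap = max(abs(normalized_sum_cdf(n, x) - normal_cdf(x)) for x in grid)
```

For this choice of `n` the largest discrepancy on the grid is of the order of half the largest single probability of the sum, which is the size of the jumps mentioned above.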

However, even though in the general case of arbitrarily distributed ran- 
dom variables only limit theorems of an “integral” type have meaning, it is 
still possible that if the terms of a given sum obey certain special types of 
laws, then “local” limit theorems may hold. These theorems can then give 
an approximate expression either for the probabilities of individually pos- 
sible values of this sum or for the density of its distribution law, depending 
upon whether the summands obey discrete (in particular, integral-valued)
or continuous distribution laws. Local limit theorems of both types turned 
out to be very important. Consequently, considerable attention has been 
given to their proof. For our purpose, the mathematical foundations of 
quantum statistics, only local limit theorems for the case of integral-valued 
random variables are necessary. Therefore, we consider only local limit 
theorems for integral-valued random variables and ignore the case of con- 
tinuous distributions. 

Suppose that we have an integral-valued random variable $\xi$ subject to
the distribution law $p_k$. The possible values of $\xi$ are those integers $k$ for
which $p_k > 0$. The set of all pairwise differences of these possible values of
the variable $\xi$ has a greatest common divisor $d$. It is clear that if $a_0$ is one
of the possible values of $\xi$, any arbitrary possible value of $\xi$ may be represented
in the form $a_0 + ld$, where $l$ is an integer. The converse is in general
not true: $a_0 + ld$ need not be a possible value of $\xi$ for an arbitrary integer $l$.
However, those numbers $l$ for which $a_0 + ld$ is a possible value evidently
form a (finite or infinite) set of integers with the greatest common divisor 1.
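
The number $d$ is easy to compute for a finitely supported law. The sketch below assumes the law is stored as a dictionary `{k: p_k}` and takes $a_0$ to be the smallest possible value (any possible value would serve); the function name `increment` and the sample law are hypothetical. It uses the fact that the greatest common divisor of the differences from one fixed possible value equals that of all pairwise differences.

```python
from functools import reduce
from math import gcd

def increment(law):
    """Return (a0, d): the smallest possible value and the gcd of the differences."""
    values = sorted(k for k, p in law.items() if p > 0)
    if len(values) < 2:
        raise ValueError("a variable with a single possible value has no increment")
    a0 = values[0]
    d = reduce(gcd, (v - a0 for v in values[1:]))
    return a0, d

# Hypothetical law with possible values 1, 7, 10 (4 is listed but impossible).
law = {1: 0.2, 4: 0.0, 7: 0.5, 10: 0.3}
a0, d = increment(law)

# The multipliers l with a0 + l*d possible; their gcd should be 1.
multipliers = [(k - a0) // d for k, p in law.items() if p > 0]
```

In this example the possible values are $1 + 3l$ with $l = 0, 2, 3$, so $l = 1$ gives an impossible value, as the text allows.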

Now let us assume that we have $n$ mutually independent random variables
$\xi_1, \xi_2, \cdots, \xi_n$, each subject to the distribution law described above.
Then the sum $s_n$ of these variables evidently assumes only values of the
form $na_0 + ld$, where $l$ is an integer. The problem of establishing a local
limit theorem in this case consists in finding a suitable approximate expression
for the probability

$$P(s_n = na_0 + ld)$$

for different values of $l$.
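
To make the problem concrete, the following sketch compares the exact probabilities $P(s_n = na_0 + ld)$ with a normal-type expression. The approximating formula used here, $d\,(2\pi n \sigma^2)^{-1/2} \exp\{-(k - n\mu)^2 / (2n\sigma^2)\}$ with $\mu$ and $\sigma^2$ the expectation and dispersion of one summand, is the standard form of the local theorem and is an assumption at this point, since the text has not yet stated it. The summand law (values 1 and 3 with probability 1/2 each, so $a_0 = 1$, $d = 2$) and all names are hypothetical.

```python
import math

def local_normal_approx(k, n, mean, var, d):
    """Assumed local approximation to P(s_n = k) for a lattice law with increment d."""
    return d / math.sqrt(2 * math.pi * n * var) * math.exp(-(k - n * mean) ** 2 / (2 * n * var))

n = 200
a0, d = 1, 2            # summand takes the values 1 and 3
mean, var = 2.0, 1.0    # expectation and dispersion of one summand

# Exact law of s_n: s_n = n*a0 + l*d, where l is binomial with parameters n and 1/2.
exact = {n * a0 + l * d: math.comb(n, l) / 2 ** n for l in range(n + 1)}

max_err = max(abs(p - local_normal_approx(k, n, mean, var, d)) for k, p in exact.items())
```

The factor `d` in the approximation reflects the fact that the sum can only take values spaced `d` apart, so each possible value carries the normal density integrated over an interval of length `d`.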

In the modern theory of probability limit theorems (of both the integral 
and the local type) are extended to sums of random vectors. One tries to 
prove that under very general conditions the distribution laws of such sums, 
after a suitable normalization, approach the “normal” law in the proper 
number of dimensions when the number of summands approaches infinity. 
Assume, for example, that $(\xi_i, \eta_i)$ $(i = 1, 2, \cdots, n)$ are mutually independent
integral-valued random vectors obeying the same law

$$p_{ab} = P(\xi_i = a,\ \eta_i = b).$$

Each pair of integers $(a, b)$ represents a point in the plane having integral-valued
Cartesian coordinates $(a, b)$. Such points for which $p_{ab} > 0$ will
be referred to as lattice points. These “possible” points of the vector
$(\xi_i, \eta_i)$ form a set of lattice points $M$ in the plane. This set is a subset of
the set of all lattice points of the plane. In general, it is possible (and in 
our applications it will often happen) that there exist in the plane other 
coarser parallelogram lattices of points covering the set M of possible points 
of our vector. The set of points of an arbitrary parallelogram lattice of the 
plane may be represented in the form

$$\begin{aligned} x &= a_0 + k\alpha + l\beta \\ y &= b_0 + k\gamma + l\delta, \end{aligned} \tag{5}$$

where $a_0, b_0, \alpha, \beta, \gamma, \delta$ are integral constants, $d = \alpha\delta - \beta\gamma \ne 0$, and $k$
and $l$ range over the complete set of integers. Every parallelogram whose
vertices belong to this lattice and which contains no other points of the lattice
in its interior or on its boundary is called a fundamental parallelogram
of the lattice. The given lattice may have fundamental parallelograms of different
form, but their area is always the same and is therefore a fundamental
characteristic of the parallelogram lattice itself. This area, as may easily be
calculated, is $|d| = |\alpha\delta - \beta\gamma|$. The parallelogram lattice covering the
set M is called a maximal lattice if the area of its fundamental parallelogram 
has the largest possible value. This maximal area of a fundamental parallelo- 
gram plays the same role in the theory of two-dimensional random vectors 
as the number d, introduced above, plays in the one-dimensional case. 

We shall call the integral-valued random vector $(\xi, \eta)$ degenerate if all
of its possible points lie on one straight line, i.e., if $\xi$ and $\eta$ are linearly dependent.
We now prove that every non-degenerate vector has a maximal
lattice. In fact, if the vector $(\xi, \eta)$ is non-degenerate, then it has at least
three possible points which do not lie on a straight line. We adjoin to them 
a fourth point so that a parallelogram P is obtained. Let the area of P be s. 
It is evident that P will be a parallelogram in any parallelogram lattice 
which covers the set M. Consequently, the fundamental parallelogram of 
such a lattice cannot have an area larger than s. It follows immediately 
that a maximal lattice can be found among such covering lattices [7]. 
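
For a finite set $M$ the area of the fundamental parallelogram of a maximal lattice can be computed directly. The sketch below rests on an assumption not stated in the text: a covering lattice is maximal exactly when it is generated by the differences of the possible points, and the area of its fundamental parallelogram is then the greatest common divisor of the $2 \times 2$ determinants formed from pairs of difference vectors (a standard fact about integer lattices). The function name and sample point sets are hypothetical.

```python
from functools import reduce
from math import gcd

def maximal_lattice_area(points):
    """Area of the fundamental parallelogram of a maximal lattice covering `points`."""
    pts = sorted(set(points))
    x0, y0 = pts[0]
    diffs = [(x - x0, y - y0) for x, y in pts[1:]]
    # All 2x2 determinants built from pairs of difference vectors.
    minors = [u[0] * v[1] - u[1] * v[0]
              for i, u in enumerate(diffs) for v in diffs[i + 1:]]
    area = reduce(gcd, (abs(m) for m in minors), 0)
    if area == 0:
        raise ValueError("degenerate vector: all possible points lie on one line")
    return area

square = [(0, 0), (2, 0), (0, 2), (2, 2)]   # covered by the lattice (2k, 2l)
checker = [(0, 0), (1, 1), (2, 0)]          # covered by the lattice with x + y even
```

A degenerate set of points, such as three collinear points, produces only zero determinants, and the function reports this case instead of returning an area.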

If the distribution law of the random vector $(\xi_i, \eta_i)$ $(i = 1, 2, \cdots, n)$
has the covering lattice (5), and if we assume that

$$\sum_{i=1}^{n} \xi_i = S_n, \qquad \sum_{i=1}^{n} \eta_i = T_n,$$

then only points $(x, y)$ of the form

$$\begin{aligned} x &= na_0 + k\alpha + l\beta \\ y &= nb_0 + k\gamma + l\delta, \end{aligned}$$

where $k$ and $l$ are integers, can be possible points of the vector $(S_n, T_n)$.
Here the problem of establishing a local limit theorem consists in finding a
convenient approximate expression for the probability

$$P(S_n = na_0 + k\alpha + l\beta,\ T_n = nb_0 + k\gamma + l\delta)$$

for given integers $k$ and $l$.

The remainder of this chapter will be devoted to establishing limit theo- 
rems of a similar type. However, in addition to finding the above-mentioned 
approximate expressions, we must also, for all cases of interest in subsequent 
applications, find an accurate estimate of the error caused by the replace- 
ment of the desired probability by an asymptotic expression. 

In concluding the present section we make several historical remarks con- 
cerning the development of the study of limit theorems that followed the 
first discoveries of Liapunov and Markov. This development continued 
(and continues to the present time) in the following fundamental directions: 
1) the extension of the limit theorems to multi-dimensional random varia- 
bles (random vectors); 2) the proof of local limit theorems (of continuous 
and discrete types) ; 3) the extension of limit theorems to sums of mutually 
dependent random variables; 4) the search for the broadest possible (necessary
and sufficient) conditions for the applicability of limit theorems; and 5)
the refinement of the estimate of the remainder terms. In this exposition we 
limit ourselves to problems in which the limiting laws are normal. Of course, 
there is also the question of what other laws under what conditions can 
serve as limiting laws for sums of a large number of random variables. 
While this question has led to a particularly well-developed theory, we are 
not able to discuss it here [8]. 


§3. The method of characteristic functions 


The most convenient method now known for proving the limit theorems 
of the theory of probability is the so-called method of characteristic func- 
tions. This is particularly so when the terms of the sum are mutually in- 
dependent. We present the fundamentals of this method as applied to the 
case of interest to us, namely that of integral-valued random variables. 

Let $\xi$ be an integral-valued random variable obeying the law

$$p_n = P(\xi = n) \qquad (-\infty < n < \infty).$$

If $f(\xi)$ is an arbitrary (real or complex) function of the variable $\xi$ and if the
series

$$\sum_{n=-\infty}^{\infty} f(n) p_n = Ef(\xi)$$

converges absolutely, then the sum of this series is what we agreed in §1
to call the mathematical expectation of the random variable $f(\xi)$. Let us
assume, in particular, that

$$f(\xi) = e^{it\xi},$$

where $i = (-1)^{1/2}$ and $t$ is an arbitrary real number. Since the series

$$\sum_{n=-\infty}^{\infty} p_n e^{int}$$

evidently converges absolutely for all real $t$, the mathematical expectation

$$Ee^{it\xi} = \sum_{n=-\infty}^{\infty} p_n e^{int}$$

of the complex random variable $e^{it\xi}$ exists for all real $t$ and is a function of
the real parameter $t$. We denote this function by $\varphi(t)$ and call it the characteristic
function of the variable $\xi$ or of the distribution law $p_n$. Thus,

$$\varphi(t) = \sum_{n=-\infty}^{\infty} p_n e^{int}. \tag{6}$$


We now mention some of the simplest general properties of characteristic
functions.
1°. Evidently, we always have

$$\varphi(0) = \sum_{n=-\infty}^{\infty} p_n = 1,$$

and for real $t$

$$|\varphi(t)| \le \sum_{n=-\infty}^{\infty} p_n = 1.$$
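
Definition (6) and property 1° translate directly into a few lines of code. This is a minimal sketch, assuming a finitely supported law stored as a dictionary `{n: p_n}`; the name `char_fn`, the sample law, and the grid of test points are hypothetical.

```python
import cmath
import math

def char_fn(law, t):
    """Formula (6): phi(t) = sum over n of p_n * exp(i*n*t)."""
    return sum(p * cmath.exp(1j * n * t) for n, p in law.items())

# Hypothetical law on the integers -1, 0, 2.
law = {-1: 0.25, 0: 0.25, 2: 0.5}

value_at_zero = char_fn(law, 0.0)

# Largest modulus of phi on a grid covering the interval from -pi to pi.
grid = [-math.pi + 2 * math.pi * j / 1000 for j in range(1001)]
max_modulus = max(abs(char_fn(law, t)) for t in grid)
```

Up to rounding, `value_at_zero` is 1 and `max_modulus` does not exceed 1, as property 1° requires.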


2°. If the variable $\xi$ has the mathematical expectation $E\xi$, then the latter
can be represented by the series

$$\sum_{n=-\infty}^{\infty} n p_n ,$$

which converges absolutely. In this case the series

$$\sum_{n=-\infty}^{\infty} i n p_n e^{int},$$

which is obtained from the series (6) by means of termwise differentiation
with respect to $t$, obviously converges absolutely and uniformly over the
whole real line. Thus, the differentiability of the characteristic function $\varphi(t)$
for arbitrary $t$, and also the relation

$$\varphi'(0) = iE\xi,$$

follow from the existence of the mathematical expectation $E\xi$ of the variable
$\xi$.
3°. In completely analogous fashion we easily convince ourselves that
the existence of the $k$th moment of the variable $\xi$ (where $k$ is an arbitrary
positive integer) implies the $k$th order differentiability of the function $\varphi(t)$,
and also the relation

$$\varphi^{(k)}(0) = i^k E\xi^k.$$


4°. From equation (6) it follows immediately that the characteristic
function of an integral-valued random variable is always a periodic function,
and that it always has a period (not necessarily the smallest) equal
to $2\pi$.
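
Properties 2° to 4° can be checked numerically. The sketch below assumes a finitely supported law and replaces the derivatives at zero by central finite differences, which is only an approximation; the step `h`, the sample law, and all names are hypothetical.

```python
import cmath
import math

def char_fn(law, t):
    """phi(t) = sum over n of p_n * exp(i*n*t) for a finitely supported law."""
    return sum(p * cmath.exp(1j * n * t) for n, p in law.items())

law = {-1: 0.25, 0: 0.25, 2: 0.5}
e1 = sum(n * p for n, p in law.items())        # first moment E xi
e2 = sum(n ** 2 * p for n, p in law.items())   # second moment E xi^2

h = 1e-4
# Central differences approximating phi'(0) and phi''(0).
d1 = (char_fn(law, h) - char_fn(law, -h)) / (2 * h)
d2 = (char_fn(law, h) - 2 * char_fn(law, 0.0) + char_fn(law, -h)) / h ** 2

# Property 4: shifting the argument by 2*pi leaves phi unchanged.
period_gap = abs(char_fn(law, 0.7 + 2 * math.pi) - char_fn(law, 0.7))
```

Within the accuracy of the finite differences, `d1` agrees with `1j * e1` and `d2` with `-e2`, that is, with $i^k E\xi^k$ for $k = 1, 2$.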

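5°. Let $\xi'$ and $\xi''$ be two mutually independent integral-valued random
variables obeying, respectively, the distribution laws $p_n'$ and $p_n''$, and having
the characteristic functions $\varphi_1(t)$ and $\varphi_2(t)$, so that

$$\varphi_1(t) = \sum_{k=-\infty}^{\infty} p_k' e^{ikt}, \qquad \varphi_2(t) = \sum_{k=-\infty}^{\infty} p_k'' e^{ikt}.$$

Let $p_n$ and $\varphi(t)$ denote, respectively, the distribution law and the characteristic
function of the sum $\xi = \xi' + \xi''$. Then, as we have seen in §1,

$$p_n = \sum_{k=-\infty}^{\infty} p_k' \, p_{n-k}'',$$

and consequently

$$\begin{aligned}
\varphi(t) &= \sum_{n=-\infty}^{\infty} p_n e^{int} = \sum_{n=-\infty}^{\infty} e^{int} \sum_{k=-\infty}^{\infty} p_k' \, p_{n-k}'' \\
&= \sum_{n=-\infty}^{\infty} \Bigl( \sum_{k=-\infty}^{\infty} p_k' e^{ikt} \, p_{n-k}'' e^{i(n-k)t} \Bigr) \\
&= \sum_{k=-\infty}^{\infty} p_k' e^{ikt} \sum_{n=-\infty}^{\infty} p_{n-k}'' e^{i(n-k)t} \\
&= \sum_{k=-\infty}^{\infty} p_k' e^{ikt} \sum_{l=-\infty}^{\infty} p_l'' e^{ilt} = \varphi_1(t)\varphi_2(t).
\end{aligned}$$

If we regard as proved the theorem that the mathematical expectation
of the product of mutually independent random variables is equal to
the product of the individual expectations, then the result just obtained
follows very simply from

$$\varphi(t) = Ee^{it\xi} = Ee^{it(\xi' + \xi'')} = E\bigl(e^{it\xi'} e^{it\xi''}\bigr) = Ee^{it\xi'}\, Ee^{it\xi''} = \varphi_1(t)\varphi_2(t).$$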


Thus, the characteristic function of the sum of mutually independent random 
variables is equal to the product of the individual characteristic functions. This 
rule of composition of characteristic functions was proved for the addition 
of only two random variables. Evidently, by using mathematical induc- 
tion, the proof can be extended immediately to the case of the sum of an 
arbitrary number of (mutually independent) random variables. 
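
The rule of composition of characteristic functions can be verified on a small case by forming the law of the sum directly and comparing the two sides. This is a minimal sketch, assuming finitely supported laws stored as dictionaries; the names `char_fn` and `compose` and the two sample laws are hypothetical.

```python
import cmath

def char_fn(law, t):
    """phi(t) = sum over n of p_n * exp(i*n*t)."""
    return sum(p * cmath.exp(1j * n * t) for n, p in law.items())

def compose(law1, law2):
    """Law of the sum of two independent variables: p_n = sum of p'_k * p''_(n-k)."""
    out = {}
    for k, pk in law1.items():
        for l, pl in law2.items():
            out[k + l] = out.get(k + l, 0.0) + pk * pl
    return out

law1 = {0: 0.5, 1: 0.5}
law2 = {-1: 0.2, 3: 0.8}
total = compose(law1, law2)

# Compare phi(t) of the sum with phi_1(t) * phi_2(t) at a few sample points.
gap = max(abs(char_fn(total, t) - char_fn(law1, t) * char_fn(law2, t))
          for t in (0.0, 0.3, 1.0, 2.5))
```

The double loop in `compose` costs one multiplication per pair of possible values, whereas the product on the right side costs a single multiplication per value of `t`; this is the simplicity the text refers to.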

This exceptional simplicity of the rule of composition of characteristic 
functions makes them an extremely convenient instrument for investigating 
sums of large numbers of mutually independent random variables, and, in 
particular, for proving limit theorems. On the other hand, the rule of com- 
position of the distribution laws themselves, given by formulas (1) and (2) 
of §1, is quite complex, particularly for a large number of summands. For 
characteristic functions this rule, as we see, is distinguished by extraordi- 
nary simplicity, so that knowing the characteristic functions of the sum- 
mands we can immediately and directly form the characteristic function 
of the sum. 

However, for the characteristic functions to be a sufficiently powerful 
tool for the investigation of sums of a large number of random variables, 
one simple rule of composition is not enough. After finding the characteris- 
tic function of the sum, we should be able to establish, with its help, the 
distribution law of this sum. At present, we have related the distribution 
law to the corresponding characteristic function only through formula (6) 
which expresses $\varphi(t)$ in terms of $p_n$. We not only do not have the inverse
relation which gives $p_n$ in terms of $\varphi(t)$, but in essence we do not even know
if the characteristic function $\varphi(t)$ defines the corresponding distribution
$p_n$ uniquely. All these questions, as we now show, are easily resolved by
means of the classical formulas of Fourier. 

6°. We multiply both sides of equation (6) by $e^{-imt}$, where $m$ is an
arbitrary integer, and integrate the expression obtained with respect to $t$
from $-\pi$ to $\pi$. The series on the right side, being uniformly convergent
in the range indicated, may be integrated termwise. We find

$$\int_{-\pi}^{\pi} e^{-imt}\varphi(t)\,dt = \sum_{n=-\infty}^{\infty} p_n \int_{-\pi}^{\pi} e^{i(n-m)t}\,dt,$$

or, since on the right side the integral is obviously equal to $2\pi$ for
$n = m$ and to zero for $n \neq m$,

$$\int_{-\pi}^{\pi} e^{-imt}\varphi(t)\,dt = 2\pi p_m.$$

Hence,

$$p_m = (1/2\pi)\int_{-\pi}^{\pi} e^{-imt}\varphi(t)\,dt. \tag{7}$$

This “inversion formula” shows that the distribution law of an
integral-valued random variable is uniquely determined by its characteristic
function $\varphi(t)$. At the same time formula (7) gives an extremely simple
expression for this dependence.
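
As an illustration of formula (7), the sketch below recovers a law from its characteristic function. The law and the names `char_fn` and `invert` are hypothetical. The integral is replaced by a uniform Riemann sum over one period; for a law with finitely many possible values the integrand is a trigonometric polynomial, so this sum reproduces the integral up to rounding error once the number of nodes exceeds the largest difference $|n - m|$ that occurs.

```python
import cmath
import math

def char_fn(pmf, t):
    """phi(t) = sum_n p_n * exp(i*n*t)."""
    return sum(p * cmath.exp(1j * n * t) for n, p in pmf.items())

def invert(phi, m, num_points=64):
    """Approximate p_m = (1/2pi) * int_{-pi}^{pi} exp(-i*m*t) * phi(t) dt
    by a uniform Riemann sum over one period."""
    h = 2 * math.pi / num_points
    total = 0j
    for j in range(num_points):
        t = -math.pi + j * h
        total += cmath.exp(-1j * m * t) * phi(t)
    return (total * h / (2 * math.pi)).real

law = {-3: 0.1, 0: 0.4, 2: 0.3, 5: 0.2}    # hypothetical example law

def phi(t):
    return char_fn(law, t)

# Recover p_m for a range of integers m, including values with p_m = 0.
recovered = {m: invert(phi, m) for m in range(-4, 7)}
```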

7°. In §1, we wrote the whole set of possible values of the random variable
$\xi$ in the form $a_0 + ld$, where $d$ is the greatest common divisor of the
pairwise differences of these values, $a_0$ is one of the possible values, and
$l$ is an integer. In the sequel we agree to call the number $d$, which is
uniquely determined by the distribution law of the variable $\xi$, the
increment of this variable. If, as previously, we denote the distribution law
of the variable $\xi$ by $p_n$, then evidently $p_n > 0$ only if

$$n \equiv a_0 \pmod{d},$$

i.e., if the number $n$ has the form $a_0 + ld$. Therefore, in contrast to what
was done previously, it will be convenient to denote by $p_l$ the probability
that $\xi = a_0 + ld$. In this notation the characteristic function of the
variable $\xi$ is expressed by the formula

$$\varphi(t) = \sum_{l=-\infty}^{\infty} p_l e^{i(a_0+ld)t} = e^{ia_0t}\sum_{l=-\infty}^{\infty} p_l e^{ildt}.$$

We immediately see that the function

$$\varphi(t)e^{-ia_0t} = \sum_{l=-\infty}^{\infty} p_l e^{ildt}$$

has period $2\pi/d$. A calculation carried out in analogy with that of 6°
easily shows that

$$p_l = (d/2\pi)\int_{-\pi/d}^{\pi/d} \varphi(t)e^{-i(a_0+ld)t}\,dt. \tag{8}$$

The great value of the concept of the increment $d$ of an integral-valued
random variable is explained by the following lemma:

Lemma 1. Let $\xi$ be an integral-valued random variable having increment $d$
and characteristic function $\varphi(t)$. Then $|\varphi(t)| < 1$ for
$0 < |t| \le \pi/d$. (The presence of the increment indicates that the random
variable $\xi$ is non-degenerate, i.e., that it has no less than two possible
values.)

Indeed, if for some value $t \neq 0$ we have

$$|\varphi(t)| = 1,$$

then

$$\Bigl|\sum_{l=-\infty}^{\infty} p_l e^{ildt}\Bigr| = |\varphi(t)e^{-ia_0t}| = 1.$$

Since $\sum_l p_l = 1$, the above is possible only if all members of the sum
for which $p_l > 0$ have the same argument $\theta$ (to within multiples of
$2\pi$), i.e., if $p_l > 0$ implies that

$$ldt = \theta + 2\pi u_l,$$

where $u_l$ is an integer. Suppose $p_{l'} > 0$ and $p_{l''} > 0$. Then

$$(l'' - l')d = (u_{l''} - u_{l'})(2\pi/t).$$

But $(l'' - l')d = (a_0 + l''d) - (a_0 + l'd)$ is the difference between any
two possible values of $\xi$. Thus, all such differences are integral
multiples of $2\pi/t$. This means that the greatest common divisor of these
differences, by definition equal to $d$, is an integral multiple of $2\pi/t$.
Hence,

$$d = s(2\pi/t),$$

where $s$ is an integer. Therefore,

$$t = 2\pi s/d.$$

Since the region $0 < |t| \le \pi/d$ does not contain such values of $t$, it
follows that $|\varphi(t)| < 1$ in this region, which was to be proved.
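
The role of the increment can be seen on a small example. In the sketch below the law and the helper names `increment` and `char_fn` are hypothetical. The increment is computed as the greatest common divisor of the differences of the possible values; the modulus of the characteristic function is then sampled on $0 < t \le \pi/d$, where Lemma 1 asserts it is less than 1, and evaluated at $t = 2\pi/d$, where it returns to 1.

```python
import cmath
import math
from functools import reduce

def increment(values):
    """Greatest common divisor of the pairwise differences of the values.
    It suffices to take the differences from the smallest value."""
    vals = sorted(values)
    return reduce(math.gcd, (v - vals[0] for v in vals[1:]))

def char_fn(pmf, t):
    """phi(t) = sum_n p_n * exp(i*n*t)."""
    return sum(p * cmath.exp(1j * n * t) for n, p in pmf.items())

law = {1: 0.2, 4: 0.5, 10: 0.3}    # hypothetical law with possible values 1, 4, 10
d = increment(law)                  # differences 3 and 9, hence d = 3

# |phi| sampled on 0 < t <= pi/d (by symmetry negative t need not be sampled)
grid = [(math.pi / d) * j / 200 for j in range(1, 201)]
max_modulus = max(abs(char_fn(law, t)) for t in grid)

# At t = 2*pi/d all terms exp(i*n*t) coincide, so the modulus equals 1 again.
modulus_at_period = abs(char_fn(law, 2 * math.pi / d))
```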
The method of characteristic functions can be generalized in a natural
fashion to treat multi-dimensional random variables (random vectors). Let
$(\xi, \eta)$ be an integral-valued random vector subject to the distribution
law

$$p_{ab} = P(\xi = a,\ \eta = b).$$

The mathematical expectation of the variable $e^{i(t\xi+s\eta)}$ is equal to

$$E e^{i(t\xi+s\eta)} = \sum_{a,b=-\infty}^{\infty} p_{ab}\,e^{i(at+bs)},$$

where $t$ and $s$ are real parameters. We will call this the characteristic
function of the vector $(\xi, \eta)$ or of the law $p_{ab}$, and designate it
by $\varphi(t, s)$, so that

$$\varphi(t, s) = \sum_{a,b=-\infty}^{\infty} p_{ab}\,e^{i(at+bs)}.$$

As in the one-dimensional case, the function $\varphi(t, s)$ is periodic with
period $2\pi$ relative to each of the two parameters; $\varphi(0, 0) = 1$, and
for any real $t$, $s$, we have $|\varphi(t, s)| \le 1$.

Further, if the moment $E(\xi^k\eta^l)$ exists, then the partial derivative

$$\partial^{k+l}\varphi/\partial t^k\,\partial s^l$$

also exists, and the value of this derivative for $t = s = 0$ is equal to

$$i^{k+l}E(\xi^k\eta^l).$$

If $(\xi_1, \eta_1)$ and $(\xi_2, \eta_2)$ are two mutually independent
integral-valued random vectors with the characteristic functions
$\varphi_1(t, s)$ and $\varphi_2(t, s)$ and if $\varphi(t, s)$ is the
characteristic function of the vector $(\xi_1 + \xi_2, \eta_1 + \eta_2)$, then

$$\varphi(t, s) = \varphi_1(t, s)\,\varphi_2(t, s).$$

This rule of composition remains valid for the addition of an arbitrary
number of mutually independent random vectors.
Further, the “inversion formula”

$$p_{ab} = (1/4\pi^2)\int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{-i(at+bs)}\varphi(t, s)\,dt\,ds$$

holds.

All of these results are established in complete analogy with the
one-dimensional case and with the same ease.

Now let the law $p_{ab}$ have the maximal lattice

$$a = a_0 + k\alpha + l\beta, \qquad b = b_0 + k\gamma + l\delta, \tag{9}$$

where $d = \alpha\delta - \beta\gamma \neq 0$. Then $p_{ab}$ can be different
from zero only if $a$ and $b$ have the form (9) for some integers $k$ and $l$.
Therefore, as in the one-dimensional case, we change the notation somewhat
and set

$$p_{kl} = P(\xi = a_0 + k\alpha + l\beta,\ \eta = b_0 + k\gamma + l\delta).$$

For the characteristic function of the vector $(\xi, \eta)$ we thus obtain the
expression

$$\begin{aligned}
\varphi(t, s) &= \sum_{k,l=-\infty}^{\infty} p_{kl}\,e^{i[t(a_0+k\alpha+l\beta)+s(b_0+k\gamma+l\delta)]} \\
&= e^{i(ta_0+sb_0)}\sum_{k,l=-\infty}^{\infty} p_{kl}\,e^{i[t(k\alpha+l\beta)+s(k\gamma+l\delta)]}.
\end{aligned} \tag{10}$$


Our next problem is to find the “inversion formula” expressing the law
$p_{kl}$ in terms of the function $\varphi(t, s)$. This will be analogous to
formula (8) of the one-dimensional case. For this purpose, we choose an
arbitrary pair of integers $(k', l')$ and multiply both sides of (10) by

$$e^{-i[t(a_0+k'\alpha+l'\beta)+s(b_0+k'\gamma+l'\delta)]}.$$

We then take the integrals of both sides, extending them over the region $D$
of the plane $(t, s)$, characterized by the inequalities

$$-\pi \le \alpha t + \gamma s \le \pi, \qquad -\pi \le \beta t + \delta s \le \pi, \tag{D}$$

which evidently represent a parallelogram of area $4\pi^2/|d|$ with its center
at the origin. We obtain

$$\begin{aligned}
&\iint_D e^{-i[t(a_0+k'\alpha+l'\beta)+s(b_0+k'\gamma+l'\delta)]}\varphi(t, s)\,dt\,ds \\
&\qquad = \sum_{k,l=-\infty}^{\infty} p_{kl}\iint_D e^{i\{[(k-k')\alpha+(l-l')\beta]t+[(k-k')\gamma+(l-l')\delta]s\}}\,dt\,ds.
\end{aligned}$$

On the right side of this equation the integral corresponding to the values
$k = k'$, $l = l'$ is evidently equal to the area of the region $D$, i.e.,
$4\pi^2/|d|$. We now show that all the remaining integrals on the right side
vanish. We choose any one of these integrals and transform the variables in
it by setting

$$\alpha t + \gamma s = u, \qquad \beta t + \delta s = v.$$

Then the integral assumes the form

$$|d|^{-1}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{i[(k-k')u+(l-l')v]}\,du\,dv
= |d|^{-1}\int_{-\pi}^{\pi} e^{i(k-k')u}\,du\int_{-\pi}^{\pi} e^{i(l-l')v}\,dv,$$

and clearly vanishes if at least one of the two differences $k - k'$,
$l - l'$ is different from zero. Thus we find

$$\iint_D e^{-i[t(a_0+k'\alpha+l'\beta)+s(b_0+k'\gamma+l'\delta)]}\varphi(t, s)\,dt\,ds = (4\pi^2/|d|)\,p_{k'l'}.$$

Hence, dropping the primes over the letters, we obtain for an arbitrary pair
of integers $k$, $l$,

$$p_{kl} = (|d|/4\pi^2)\iint_D e^{-i[t(a_0+k\alpha+l\beta)+s(b_0+k\gamma+l\delta)]}\varphi(t, s)\,dt\,ds. \tag{11}$$


This is the inversion formula we shall need. 
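
The inversion over the parallelogram $D$ can be illustrated numerically. The sketch below is built on assumptions: the lattice data, the law, and the function names are hypothetical. The region $D$ is traversed through the variables $u = \alpha t + \gamma s$, $v = \beta t + \delta s$ used in the derivation; since $dt\,ds = du\,dv/|d|$, the prefactor $|d|/4\pi^2$ becomes $1/4\pi^2$ on the square $-\pi \le u, v \le \pi$. For a law with finitely many possible points the integrand is a trigonometric polynomial in $u$, $v$, so a uniform Riemann sum reproduces the integral up to rounding error.

```python
import cmath
import math

# Hypothetical lattice data (not from the text): base point and generators.
A0, B0 = 1, 0
ALPHA, BETA, GAMMA, DELTA = 1, 1, 0, 2
D = ALPHA * DELTA - BETA * GAMMA            # d = 2

# Hypothetical law p_kl, indexed by the lattice indices (k, l).
law = {(0, 0): 0.4, (1, 0): 0.3, (0, 1): 0.2, (-1, 2): 0.1}

def point(k, l):
    """Possible point (a, b) with lattice indices (k, l)."""
    return (A0 + k * ALPHA + l * BETA, B0 + k * GAMMA + l * DELTA)

def char_fn(t, s):
    """phi(t, s) = sum over (k, l) of p_kl * exp(i*(a*t + b*s))."""
    total = 0j
    for (k, l), p in law.items():
        a, b = point(k, l)
        total += p * cmath.exp(1j * (a * t + b * s))
    return total

def invert(k, l, num=16):
    """Riemann-sum version of the inversion integral over D."""
    a, b = point(k, l)
    h = 2 * math.pi / num
    total = 0j
    for i in range(num):
        u = -math.pi + i * h
        for j in range(num):
            v = -math.pi + j * h
            # Solve alpha*t + gamma*s = u, beta*t + delta*s = v for (t, s).
            t = (DELTA * u - GAMMA * v) / D
            s = (-BETA * u + ALPHA * v) / D
            total += cmath.exp(-1j * (a * t + b * s)) * char_fn(t, s)
    # dt ds = du dv / |d|, so |d| / (4 pi^2) turns into 1 / (4 pi^2).
    return (total * h * h / (4 * math.pi ** 2)).real

indices = [(0, 0), (1, 0), (0, 1), (-1, 2), (2, 2)]
recovered = {kl: invert(*kl) for kl in indices}
```

The last index pair, $(2, 2)$, carries zero probability in this example, and the sum returns zero for it as well.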

Finally, to prove the two-dimensional limit theorem, it is very important
to establish the proposition analogous to Lemma 1: If the lattice (9) is
maximal, then $|\varphi(t, s)| < 1$ everywhere within and on the boundary of
the region $D$, except for the point $t = s = 0$.

Indeed, let $(t, s)$ be an arbitrary point of the plane for which

$$|\varphi(t, s)| = 1.$$

Then, by (10),

$$\Bigl|\sum_{k,l=-\infty}^{\infty} p_{kl}\,e^{i[t(a_0+k\alpha+l\beta)+s(b_0+k\gamma+l\delta)]}\Bigr| = 1. \tag{12}$$

Since

$$\sum_{k,l=-\infty}^{\infty} p_{kl} = 1,$$

all terms in the sum (12) in which $p_{kl} > 0$ must have the same argument
(to within multiples of $2\pi$). Thus, if $p_{kl} > 0$, then

$$(a_0 + k\alpha + l\beta)t + (b_0 + k\gamma + l\delta)s = \theta + 2m\pi,$$

where $\theta$ is a constant and $m$ is an integer (depending on $k$ and $l$).
Hence, an arbitrary possible point $(a, b)$ of the vector $(\xi, \eta)$
satisfies the relation

$$at + bs = \theta + 2m\pi,$$

with $m$ an integer.

Let us consider the points

$$t_1 = (2\pi/|d|)\delta, \quad s_1 = -(2\pi/|d|)\beta; \qquad
t_2 = -(2\pi/|d|)\gamma, \quad s_2 = (2\pi/|d|)\alpha$$

of the plane $(t, s)$. It is clear that these two points do not lie on one
straight line through the origin. It is easily verified that

$$|\varphi(t_1, s_1)| = |\varphi(t_2, s_2)| = 1.$$

Let us assume now that $|\varphi(t, s)| = 1$ at some point $(t, s)$ different
from the origin. As we just proved, an arbitrary possible point $(a, b)$ of
the vector $(\xi, \eta)$ then lies on one of the system of parallel lines

$$at + bs = \theta + 2m\pi, \tag{13}$$

where $m$ is an integer. Since $(t_1, s_1)$ and $(t_2, s_2)$ lie on a straight
line which does not pass through the origin, the straight line passing through
the origin and the point $(t, s)$ cannot contain both points $(t_1, s_1)$ and
$(t_2, s_2)$. For definiteness, let $(t_1, s_1)$ lie outside this line; then
$t_1s - s_1t \neq 0$. Since $|\varphi(t_1, s_1)| = 1$, all possible points of
the vector $(\xi, \eta)$ belong, in addition to the system (13), to the system
of straight lines

$$at_1 + bs_1 = \theta_1 + 2\pi n, \tag{14}$$

with $n$ an integer. The points of intersection of the systems of straight
lines (13) and (14) form a lattice which, as we have shown, covers the set
$M$. An easy calculation shows that the area of the fundamental parallelogram
of this lattice is

$$4\pi^2/|s_1t - t_1s|. \tag{15}$$

If the point $(t, s)$ lies, as we now assume, within or on the boundary of
the region $D$, then by the definition of this region,
$|\beta t + \delta s| \le \pi$. Hence,

$$|s_1t - t_1s| = (2\pi/|d|)\,|\beta t + \delta s| \le 2\pi^2/|d|.$$

It follows that the area (15) of the fundamental parallelogram of the lattice
just constructed must be $\ge 2|d|$. This is impossible, since this lattice
covers the set $M$, for which $|d|$ is the area of the fundamental
parallelogram of the maximal lattice (9). Thus, we have proved the following
proposition:

Lemma 2. Suppose that all possible points $(a, b)$ of the random vector
$(\xi, \eta)$ are covered by the maximal lattice

$$a = a_0 + k\alpha + l\beta, \qquad b = b_0 + k\gamma + l\delta.$$

Then $|\varphi(t, s)| < 1$ everywhere within and on the boundary of the region
$D$ defined by the inequalities

$$|\alpha t + \gamma s| \le \pi, \qquad |\beta t + \delta s| \le \pi,$$

except at the point $t = s = 0$.
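
A numerical spot check of Lemma 2 may be helpful. The sketch below uses a hypothetical lattice and law (for which the lattice is maximal, since the index differences generate all integer pairs); none of the names come from the text. It evaluates the modulus of the characteristic function at the two special points $(t_1, s_1)$ and $(t_2, s_2)$ of the proof, where it equals 1, and samples the closed region $D$ with the origin omitted, where the lemma asserts it is less than 1. A finite grid can of course only illustrate the statement, not prove it.

```python
import cmath
import math

# Hypothetical lattice and law, chosen only for illustration.
A0, B0 = 1, 0
ALPHA, BETA, GAMMA, DELTA = 1, 1, 0, 2
D = ALPHA * DELTA - BETA * GAMMA            # d = 2
law = {(0, 0): 0.4, (1, 0): 0.3, (0, 1): 0.2, (-1, 2): 0.1}   # p_kl

def char_fn(t, s):
    """phi(t, s) for the law above."""
    total = 0j
    for (k, l), p in law.items():
        a = A0 + k * ALPHA + l * BETA
        b = B0 + k * GAMMA + l * DELTA
        total += p * cmath.exp(1j * (a * t + b * s))
    return total

# The two special points from the proof, where the modulus equals 1.
scale = 2 * math.pi / abs(D)
t1, s1 = scale * DELTA, -scale * BETA
t2, s2 = -scale * GAMMA, scale * ALPHA
modulus_1 = abs(char_fn(t1, s1))
modulus_2 = abs(char_fn(t2, s2))

# Sample the closed region D through u = alpha*t + gamma*s, v = beta*t + delta*s.
steps = 40
worst = 0.0
for i in range(steps + 1):
    for j in range(steps + 1):
        if i == steps // 2 and j == steps // 2:
            continue                         # skip the origin t = s = 0
        u = -math.pi + 2 * math.pi * i / steps
        v = -math.pi + 2 * math.pi * j / steps
        t = (DELTA * u - GAMMA * v) / D
        s = (-BETA * u + ALPHA * v) / D
        worst = max(worst, abs(char_fn(t, s)))
```

In this example the special points lie outside $D$, in agreement with the lemma.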


§4. The one-dimensional limit theorem


Let $\xi_1, \xi_2, \ldots, \xi_n$ be mutually independent integral-valued
random variables subject to the same distribution law

$$p_l = P(\xi_i = a_0 + ld) \qquad (-\infty < l < \infty),$$

where $a_0$ is one of the possible values and $d$ is the increment of the
variable $\xi_i$. Let us set

$$\xi_1 + \xi_2 + \cdots + \xi_n = s_n.$$

Evidently, the random variable $s_n$ can assume only values of the form
$na_0 + ld$ ($l$ an integer), and for brevity we write

$$P(s_n = na_0 + ld) = P_n(l).$$

Let

$$\varphi(t) = \sum_{l=-\infty}^{\infty} p_l\,e^{i(a_0+ld)t}$$

be the characteristic function of $\xi_i$. Then, according to the rule of
composition, the characteristic function of $s_n$ is $[\varphi(t)]^n$, which
we write briefly as $\varphi^n(t)$. Thus,

$$\sum_{l=-\infty}^{\infty} P_n(l)\,e^{i(na_0+ld)t} = \varphi^n(t).$$

The inversion formula (8), §3 thus gives

$$P_n(l) = (d/2\pi)\int_{-\pi/d}^{\pi/d} e^{-i(na_0+ld)t}\varphi^n(t)\,dt \qquad (-\infty < l < \infty). \tag{16}$$

Our aim is to find convenient approximate expressions for the distribution
law $P_n(l)$ and to estimate the associated error. Formula (16) will serve
as a starting point.

We assume that the distribution law $p_l$ of $\xi_i$ has finite moments up to
the fifth order inclusive, i.e., that the series

$$\sum_{l=-\infty}^{\infty} |l|^5\,p_l$$

converges. As we saw in §3, this implies the existence of the derivatives of
$\varphi(t)$ up to the fifth order inclusive. In particular, the mathematical
expectation

$$E\xi_i = \sum_{l=-\infty}^{\infty} (a_0 + ld)\,p_l = -i\varphi'(0) = a$$

and the dispersion

$$D\xi_i = E\xi_i^2 - (E\xi_i)^2 = \sum_{l=-\infty}^{\infty} (a_0 + ld - a)^2\,p_l
= -\varphi''(0) + [\varphi'(0)]^2 = b$$

of $\xi_i$ exist.

Let us set $na_0 + ld - na = u$. (For given $l$ the variable $u$ represents the
deviation of the value $na_0 + ld$ of the random variable $s_n$ from its
mathematical expectation, which is evidently equal to $na$.) In the present
section we show that for $n \to \infty$

$$P_n(l) = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + n^{-3/2}(m_0 + m_1u) + O[n^{-5/2}(n^{1/2} + |u|^3)], \tag{17}$$

where $m_0$ and $m_1$ are constants (independent of $n$ and $u$). Formula (17)
is the form of the one-dimensional local limit theorem which we need. In the
case of deviations $u$ which are not too large, this formula gives a very
accurate and at the same time very simple approximate estimate of the
probability $P_n(l)$.

To prove (17) we first establish an auxiliary proposition, a refinement of
Lemma 1, §3.

Lemma 3. For $-\pi/d \le t \le \pi/d$ we have

$$|\varphi(t)| \le e^{-ct^2},$$

where $c$ is a positive constant.

Proof. In virtue of our assumptions concerning the existence of the moments
of the variable $\xi_i$, the function $\varphi(t)$ in some neighborhood of
zero may be represented in the form

$$\varphi(t) = 1 + iat - \tfrac{1}{2}qt^2 + O(|t|^3),$$

where $a = E\xi$, $q = E\xi^2$. (For brevity, we omit the subscript from
$\xi_i$.) Hence

$$\begin{aligned}
|\varphi(t)|^2 &= (1 - \tfrac{1}{2}qt^2)^2 + a^2t^2 + O(|t|^3) \\
&= 1 - t^2(q - a^2) + O(|t|^3) \\
&= 1 - bt^2 + O(|t|^3),
\end{aligned}$$

where $b = q - a^2 > 0$ is the dispersion of $\xi$. Hence,

$$|\varphi(t)|^2 \le 1 - bt^2 + c_1|t|^3$$

with $c_1$ a positive constant. Let us set $b/2c_1 = \sigma$. For
$|t| \le \sigma$ we obtain

$$|\varphi(t)|^2 \le 1 - bt^2 + c_1\sigma t^2 = 1 - \tfrac{1}{2}bt^2 \le e^{-bt^2/2},$$

$$|\varphi(t)| \le e^{-bt^2/4}. \tag{18}$$

But, for $\sigma \le |t| \le \pi/d$ we have, by Lemma 1, $|\varphi(t)| < 1$.
Since the function $|\varphi(t)|$ is continuous, a constant $\mu < 1$ can be
found such that

$$|\varphi(t)| \le \mu$$

for $\sigma \le |t| \le \pi/d$. Determining the positive number $c_2$ from the
equality

$$\mu = e^{-c_2(\pi/d)^2},$$

we have for $\sigma \le |t| \le \pi/d$,

$$|\varphi(t)| \le \mu = e^{-c_2(\pi/d)^2} \le e^{-c_2t^2}. \tag{19}$$

Now if $c$ denotes the smaller of the numbers $b/4$ and $c_2$, then, in view
of (18) and (19), we have in the whole interval $(-\pi/d, +\pi/d)$,

$$|\varphi(t)| \le e^{-ct^2}.$$

This proves Lemma 3.
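
To make the constant $c$ of Lemma 3 concrete, the sketch below finds, for one hypothetical law with increment $d = 1$, the largest $c$ for which the Gaussian bound holds at every node of a grid on $0 < t \le \pi/d$. The law and all names are assumptions introduced for illustration, and the value found is only a grid estimate. In this example it comes out smaller than $b/4$, which shows why the proof takes the smaller of $b/4$ and a second constant governing the part of the interval away from zero.

```python
import cmath
import math

def char_fn(pmf, t):
    """phi(t) = sum_n p_n * exp(i*n*t)."""
    return sum(p * cmath.exp(1j * n * t) for n, p in pmf.items())

law = {0: 0.5, 1: 0.3, 2: 0.2}     # hypothetical law; possible values 0, 1, 2
d = 1                               # increment of this law
mean = sum(n * p for n, p in law.items())
b = sum((n - mean) ** 2 * p for n, p in law.items())    # dispersion

# Sample 0 < t <= pi/d; |phi| is even in t, so negative t need not be sampled.
grid = [(math.pi / d) * j / 400 for j in range(1, 401)]

# |phi(t)| <= exp(-c t^2) is equivalent to c <= -ln|phi(t)| / t^2,
# so the best constant on the grid is the minimum of the right-hand side.
c_best = min(-math.log(abs(char_fn(law, t))) / (t * t) for t in grid)

bound_holds = all(
    abs(char_fn(law, t)) <= math.exp(-c_best * t * t) + 1e-12 for t in grid
)
```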

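We now turn to the proof of (17). We divide this proof into steps so that
it can be more easily comprehended.

1. Let us choose an arbitrary number $\lambda$, $3/7 < \lambda < 1/2$, which
we subsequently consider to be constant, and let us put
$n^{-\lambda} = \sigma$, so that the quantity $\sigma$ approaches zero as
$n \to \infty$. For constant $r \ge 0$, $a > 0$ and for $n \to \infty$,

$$\begin{aligned}
\int_{\sigma}^{\infty} t^r e^{-ant^2}\,dt
&= (an)^{-(r+1)/2}\int_{a^{1/2}n^{1/2-\lambda}}^{\infty} u^r e^{-u^2}\,du \\
&< (an)^{-(r+1)/2}\,e^{-an^{1-2\lambda}/2}\int_{0}^{\infty} u^r e^{-u^2/2}\,du = o(n^{-\omega}),
\end{aligned} \tag{20}$$

where $\omega$ is any positive constant. (The above follows since

$$e^{-u^2} = e^{-u^2/2}\,e^{-u^2/2} \le e^{-an^{1-2\lambda}/2}\,e^{-u^2/2}$$

in the interval of integration.) Hence, by Lemma 3,

$$\Bigl|\int_{\sigma<|t|\le\pi/d} e^{-i(na_0+ld)t}\varphi^n(t)\,dt\Bigr|
\le \int_{\sigma<|t|\le\pi/d} |\varphi(t)|^n\,dt
< 2\int_{\sigma}^{\infty} e^{-cnt^2}\,dt = o(n^{-\omega});$$

and consequently, by (16),

$$P_n(l) = (d/2\pi)\int_{-\sigma}^{\sigma} e^{-i(na_0+ld)t}\varphi^n(t)\,dt + o(n^{-\omega}) \qquad (-\infty < l < \infty). \tag{21}$$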

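2. Formula (21) allows us to limit ourselves to the analysis of values of $t$
between $-\sigma$ and $+\sigma$. In particular, the product
$n|t|^3 \le n^{1-3\lambda}$ can always be considered arbitrarily small because
$\lambda > 1/3$.

Since, according to our assumption, $\varphi(t)$ has a derivative of the fifth
order at $t = 0$,

$$\varphi(t) = 1 + itE\xi - \tfrac{1}{2}t^2E\xi^2 - \tfrac{1}{6}it^3E\xi^3 + \tfrac{1}{24}t^4E\xi^4 + O(|t|^5).$$

Therefore, because $E\xi = a$, $E\xi^2 = b + a^2$,

$$\ln\varphi(t) = ita - \tfrac{1}{2}bt^2 + ic_3t^3 + c_4t^4 + O(|t|^5),$$

where $c_3$ and $c_4$ are real constants. Hence,

$$-it(na + u) + n\ln\varphi(t) = -itu - \tfrac{1}{2}nbt^2 + ic_3nt^3 + c_4nt^4 + O(n|t|^5), \tag{22}$$

where all the terms on the right side, starting with the third, are
arbitrarily small for $|t| \le \sigma = n^{-\lambda}$.

Since $\lambda > 3/7$, we have $n^4t^{12} = o(n|t|^5)$ and consequently

$$\cos(c_3nt^3) = 1 - \tfrac{1}{2}c_3^2n^2t^6 + O(n|t|^5).$$

Hence, in virtue of

$$e^{c_4nt^4} = 1 + c_4nt^4 + O(n^2t^8) = 1 + c_4nt^4 + O(n|t|^5),$$

we obtain from (22) the following relation:

$$\begin{aligned}
e^{-it(na+u)}\varphi^n(t)
&= e^{-itu-\frac{1}{2}nbt^2}(\cos c_3nt^3 + i\sin c_3nt^3)\{1 + c_4nt^4 + O(n|t|^5)\} \\
&= e^{-itu-\frac{1}{2}nbt^2}\{1 - \tfrac{1}{2}c_3^2n^2t^6 + O(n|t|^5) + i\sin c_3nt^3\}\{1 + c_4nt^4 + O(n|t|^5)\} \\
&= e^{-itu-\frac{1}{2}nbt^2}\{1 + c_4nt^4 - \tfrac{1}{2}c_3^2n^2t^6 + i\sin c_3nt^3 - \tfrac{1}{2}c_3^2c_4n^3t^{10} \\
&\qquad\qquad + ic_4nt^4\sin c_3nt^3 + O(n|t|^5)\}.
\end{aligned}$$

But, because $\lambda > 3/7 > 1/3$,

$$\tfrac{1}{2}c_3^2c_4n^3t^{10} = O(n^2|t|^7), \qquad c_4nt^4\sin c_3nt^3 = O(n^2|t|^7),$$

and we find finally

$$e^{-it(na+u)}\varphi^n(t) = e^{-itu-\frac{1}{2}nbt^2}\{1 + c_4nt^4 - \tfrac{1}{2}c_3^2n^2t^6 + i\sin c_3nt^3 + O(n|t|^5) + O(n^2|t|^7)\}. \tag{23}$$

Because $na + u = na_0 + ld$, the left side of the above equation is the
integrand of the integral (21). Therefore, we have obtained the required
asymptotic expression for the integrand.

3. Let us first estimate the integral

$$(d/2\pi)\int_{-\sigma}^{\sigma} e^{-itu-\frac{1}{2}nbt^2}\,dt$$

of the main term of this formula. Using the well-known formula of Laplace:

$$(1/2\pi)\int_{-\infty}^{\infty} e^{-itu-\frac{1}{2}nbt^2}\,dt = (2\pi nb)^{-1/2}e^{-u^2/2nb}, \tag{L}$$

we obtain

$$(d/2\pi)\int_{-\sigma}^{\sigma} e^{-itu-\frac{1}{2}nbt^2}\,dt
= d(2\pi nb)^{-1/2}e^{-u^2/2nb} - (d/2\pi)\int_{|t|>\sigma} e^{-itu-\frac{1}{2}nbt^2}\,dt.$$

Hence, by (20),

$$(d/2\pi)\int_{-\sigma}^{\sigma} e^{-itu-\frac{1}{2}nbt^2}\,dt = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + o(n^{-\omega}), \tag{24}$$

where $\omega$ is any real number.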
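
The formula of Laplace used in this step is easy to confirm numerically. In the sketch below the parameter values and the names `laplace_lhs` and `laplace_rhs` are chosen arbitrarily for illustration. The infinite integral is truncated at twelve standard widths of the Gaussian factor, where the integrand is negligible, and evaluated by the trapezoidal rule; only the even part $\cos tu$ of $e^{-itu}$ is kept, since the odd part integrates to zero.

```python
import math

def laplace_lhs(u, n, b, steps=4000):
    """Trapezoidal approximation of
    (1/2pi) * int_{-inf}^{inf} exp(-i*t*u - n*b*t^2/2) dt."""
    half_width = 12.0 / math.sqrt(n * b)     # integrand is negligible beyond this
    h = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        t = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.cos(t * u) * math.exp(-0.5 * n * b * t * t)
    return total * h / (2 * math.pi)

def laplace_rhs(u, n, b):
    """Closed form (2*pi*n*b)^(-1/2) * exp(-u^2 / (2*n*b))."""
    return math.exp(-u * u / (2 * n * b)) / math.sqrt(2 * math.pi * n * b)

# Arbitrary illustrative values of the deviation u, the number n and the dispersion b.
gap = abs(laplace_lhs(3.0, 50, 0.61) - laplace_rhs(3.0, 50, 0.61))
```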

4. Now let us turn to the estimate of the remainder terms. First, we replace
$e^{-itu}$ on the right side of (23) by $\cos ut - i\sin ut$ and perform a
term-by-term multiplication, putting $\cos ut = 1 + O(u^2t^2)$. We retain
only even terms in $t$ in the product since the odd terms vanish after
integration. This yields for the set of remainder terms on the right side of
(23) the expression

$$\begin{aligned}
&e^{-\frac{1}{2}nbt^2}\{c_4nt^4 - \tfrac{1}{2}c_3^2n^2t^6 + \sin ut\,\sin c_3nt^3 \\
&\qquad + O(u^2nt^6) + O(u^2n^2t^8) + O(n|t|^5) + O(n^2|t|^7)\} \\
&\quad = e^{-\frac{1}{2}nbt^2}\{c_4nt^4 - \tfrac{1}{2}c_3^2n^2t^6 + uc_3nt^4 + O(|u|n^3t^{10}) \\
&\qquad + O(|u|^3nt^6) + O(u^2nt^6) + O(u^2n^2t^8) + O(n|t|^5) + O(n^2|t|^7)\}.
\end{aligned} \tag{25}$$

Now let us integrate. We integrate between the limits $(-\infty, \infty)$ in
all cases. In view of (20), the extension of the region of integration
introduces an error only of order $o(n^{-\omega})$ for the first three terms
(where $\omega$ is any real number) and only strengthens the estimates for the
remaining terms. Since for any $r \ge 0$,

$$\int_{-\infty}^{\infty} |t|^r e^{-\frac{1}{2}nbt^2}\,dt = n^{-(r+1)/2}k_r, \tag{26}$$

where $k_r$ is a positive constant, integration of the first three terms of
the expansion (25) yields an expression of the form

$$n^{-3/2}(m_0 + m_1u),$$

with $m_0$ and $m_1$ independent of $n$ and $u$.
After integration the remaining six terms of (25) yield, respectively,

$$O(|u|n^{-5/2}), \quad O(|u|^3n^{-5/2}), \quad O(u^2n^{-5/2}), \quad O(u^2n^{-5/2}), \quad O(n^{-2}), \quad O(n^{-2}).$$

Therefore, the sum of all the integrals obtained can be written as

$$n^{-3/2}(m_0 + m_1u) + O[n^{-5/2}(n^{1/2} + |u|^3)],$$

with $m_0$ and $m_1$ constants independent of $n$ and $u$. The integral of the
set of remainder terms taken over the interval $(-\sigma, +\sigma)$ differs
from this expression by a quantity of order $o(n^{-\omega})$. But since the
integral of the principal term is given by (24), we find

$$(d/2\pi)\int_{-\sigma}^{\sigma} e^{-it(na+u)}\varphi^n(t)\,dt
= d(2\pi nb)^{-1/2}e^{-u^2/2nb} + n^{-3/2}(m_0 + m_1u) + O[n^{-5/2}(n^{1/2} + |u|^3)],$$

where $u = na_0 + ld - na$, and $m_0$, $m_1$ are independent of $n$ and $u$. In
virtue of (21) this yields

$$P_n(l) = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + n^{-3/2}(m_0 + m_1u) + O[n^{-5/2}(n^{1/2} + |u|^3)], \tag{27}$$

which we had intended to establish.

The one-dimensional limit theorem expressed by (27) yields an estimate
for the probability $P_n(l)$ which is accurate enough for all the applications
needed in quantum statistics. However, a cruder (but simpler) estimate,
which we now establish, is sufficient in the majority of cases. Returning to
(23), we again substitute $\cos tu - i\sin tu$ for $e^{-itu}$, but this time we
do not put $\cos tu = 1 + O(t^2u^2)$. Instead, we simply note that $\cos tu$ is
an even function of $t$ whose absolute value does not exceed unity. As before,
this yields the principal term

$$e^{-itu-\frac{1}{2}nbt^2}.$$

Writing only terms in even powers of $t$, we obtain for the set of remainder
terms the expression

$$\begin{aligned}
&e^{-\frac{1}{2}nbt^2}\{c_4nt^4\cos tu - \tfrac{1}{2}c_3^2n^2t^6\cos tu + (\sin tu)\sin c_3nt^3 + O(n|t|^5) + O(n^2|t|^7)\} \\
&\quad = e^{-\frac{1}{2}nbt^2}\{O(nt^4) + O(n^2t^6) + O(|u|nt^4)\}.
\end{aligned}$$

After integration between the limits $(-\infty, \infty)$, the three terms
obtained yield, in view of (26),

$$O(n^{-3/2}), \quad O(n^{-3/2}), \quad O(|u|n^{-3/2}).$$

Consequently, as before, we find

$$P_n(l) = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + O[n^{-3/2}(1 + |u|)]. \tag{28}$$

This is the cruder estimate of the probability $P_n(l)$ which we shall
subsequently need.

In conclusion, we give a complete formulation of the one-dimensional
limit theorem just proved.

THEOREM 1. Let

$$p_l = P(\xi = a_0 + ld)$$

be the distribution law of the integral-valued random variable $\xi$ with a
possible value $a_0$, increment $d$, mathematical expectation $a$, and
dispersion $b$. Let $s_n$ be the sum of $n$ mutually independent random
variables each subject to the law $p_l$ and let

$$P(s_n = na_0 + ld) = P_n(l).$$

If the law $p_l$ has finite moments up to the fifth order inclusive, then as
$n \to \infty$ the following relations hold uniformly in $l$:

$$P_n(l) = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + O[n^{-3/2}(1 + |u|)] \tag{28}$$

and

$$P_n(l) = d(2\pi nb)^{-1/2}e^{-u^2/2nb} + n^{-3/2}(m_0 + m_1u) + O[n^{-5/2}(n^{1/2} + |u|^3)], \tag{27}$$

where $u = na_0 + ld - na$, and $m_0$, $m_1$ are constants.
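
The content of Theorem 1 can be illustrated by comparing exact probabilities with the principal term of (28). The sketch below rests on assumptions: the law (given by its lattice indices $l$), the values of $a_0$ and $d$, and the function names are hypothetical. The exact law of $s_n$ is obtained by repeated convolution, and the largest deviation from $d(2\pi nb)^{-1/2}e^{-u^2/2nb}$ over the attainable values of $l$ is recorded for two values of $n$. No particular numerical value of the error is claimed here, only that it is small and decreases as $n$ grows.

```python
import math

def n_fold(p, n):
    """Exact law of the lattice index l of s_n = xi_1 + ... + xi_n,
    obtained by n successive convolutions."""
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for l1, q1 in dist.items():
            for l2, q2 in p.items():
                nxt[l1 + l2] = nxt.get(l1 + l2, 0.0) + q1 * q2
        dist = nxt
    return dist

def max_error(p, a0, d, n):
    """Largest |P_n(l) - d*(2*pi*n*b)^(-1/2)*exp(-u^2/(2*n*b))| over attainable l."""
    a = sum((a0 + l * d) * q for l, q in p.items())            # expectation
    b = sum((a0 + l * d - a) ** 2 * q for l, q in p.items())   # dispersion
    worst = 0.0
    for l, prob in n_fold(p, n).items():
        u = n * a0 + l * d - n * a
        main = d * math.exp(-u * u / (2 * n * b)) / math.sqrt(2 * math.pi * n * b)
        worst = max(worst, abs(prob - main))
    return worst

# Hypothetical law: xi takes the values a0 + l*d = 1, 4, 10 (indices l = 0, 1, 3).
p = {0: 0.2, 1: 0.5, 3: 0.3}
A0, D = 1, 3
err_100 = max_error(p, A0, D, 100)
err_400 = max_error(p, A0, D, 400)
```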


§5. The two-dimensional limit theorem 


Let $(\xi_1, \eta_1), (\xi_2, \eta_2), \ldots, (\xi_n, \eta_n)$ be mutually
independent integral-valued random vectors subject to the same distribution
law $p_{kl}$ with the maximal lattice

$$a_0 + k\alpha + l\beta, \qquad b_0 + k\gamma + l\delta, \qquad \alpha\delta - \beta\gamma = d \neq 0,$$

so that

$$p_{kl} = P(\xi_i = a_0 + k\alpha + l\beta,\ \eta_i = b_0 + k\gamma + l\delta)
\qquad (1 \le i \le n;\ -\infty < k < \infty;\ -\infty < l < \infty).$$

Let us put

$$\sum_{i=1}^{n} \xi_i = S_n, \qquad \sum_{i=1}^{n} \eta_i = T_n.$$

Obviously, the only possible points for the random vector $(S_n, T_n)$ are
pairs of numbers of the form

$$S_n = na_0 + k\alpha + l\beta, \qquad T_n = nb_0 + k\gamma + l\delta, \tag{29}$$

which comprise a lattice of the vector $(S_n, T_n)$ if $k$ and $l$ range
through all the integers. For brevity, let us put

$$P(S_n = na_0 + k\alpha + l\beta,\ T_n = nb_0 + k\gamma + l\delta) = P_n(k, l).$$

The characteristic function of the vector $(\xi_i, \eta_i)$ is

$$\varphi(t, s) = \sum_{k,l=-\infty}^{\infty} p_{kl}\,e^{i[(a_0+k\alpha+l\beta)t+(b_0+k\gamma+l\delta)s]}.$$

According to the composition rule, the characteristic function of the vector
$(S_n, T_n)$ is $[\varphi(t, s)]^n$ which we write briefly as
$\varphi^n(t, s)$. Hence,

$$\sum_{k,l=-\infty}^{\infty} P_n(k, l)\,e^{i[(na_0+k\alpha+l\beta)t+(nb_0+k\gamma+l\delta)s]} = \varphi^n(t, s).$$

Therefore, the inversion formula (11) of §3 yields

$$P_n(k, l) = (|d|/4\pi^2)\iint_D e^{-i[(na_0+k\alpha+l\beta)t+(nb_0+k\gamma+l\delta)s]}\varphi^n(t, s)\,dt\,ds, \tag{30}$$

where the region $D$ is defined by the inequalities

$$|\alpha t + \gamma s| \le \pi, \qquad |\beta t + \delta s| \le \pi. \tag{D}$$

[We did not investigate the question of the maximality of the lattice (29)
which we introduced for the vector $(S_n, T_n)$. However, this is not
necessary since the inversion formula (11) holds for any lattice, whether
maximal or not, provided that the lattice includes all possible points of the
vector.]

We now wish to establish a convenient approximate expression for the
probability $P_n(k, l)$ and to find an accurate estimate of the error incurred
in the approximation.

We assume that the initial distribution $p_{kl}$ is not degenerate; hence,
according to §2 it follows that the distribution has a maximal lattice. (This,
however, we had already assumed at the beginning of this section.)
Furthermore, as in the one-dimensional case, we assume that $\xi_i$ and
$\eta_i$ have finite moments up to the fifth order inclusive, which is
equivalent to assuming the convergence of the double series

$$\sum_{k,l=-\infty}^{\infty} (|k|^5 + |l|^5)\,p_{kl}.$$

The last assumption guarantees the existence of all the partial derivatives
up to the fifth order inclusive of the characteristic function
$\varphi(t, s)$ at the point $t = s = 0$. In particular, this guarantees the
existence of the mathematical expectations

$$a_1 = E\xi_i = -i(\partial\varphi/\partial t)_{t=s=0}, \qquad
a_2 = E\eta_i = -i(\partial\varphi/\partial s)_{t=s=0},$$

of the dispersions

$$b_{11} = E[(\xi_i - a_1)^2] = [(\partial\varphi/\partial t)^2 - \partial^2\varphi/\partial t^2]_{t=s=0},$$

$$b_{22} = E[(\eta_i - a_2)^2] = [(\partial\varphi/\partial s)^2 - \partial^2\varphi/\partial s^2]_{t=s=0},$$

and of the “mixed” second order moment

$$b_{12} = (b_{11}b_{22})^{1/2}R_{12} = E[(\xi_i - a_1)(\eta_i - a_2)]
= [(\partial\varphi/\partial t)(\partial\varphi/\partial s) - \partial^2\varphi/\partial t\,\partial s]_{t=s=0}.$$

Hence, because of the assumed non-degeneracy of the vector
$(\xi_i, \eta_i)$, it follows from Schwarz’s inequality that the correlation
coefficient $R_{12}$ of the quantities $(\xi_i, \eta_i)$ is different from
$\pm 1$, i.e.,

$$|b_{12}| < (b_{11}b_{22})^{1/2},$$

or

$$\Delta = b_{11}b_{22} - b_{12}^2 > 0.$$
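
For a concrete law these quantities are computed directly from their definitions as expectations. The following sketch uses a hypothetical law on four points; the variable names mirror the symbols above but are otherwise assumptions.

```python
import math

# Hypothetical law P(xi = a, eta = b) on four points of the integer plane.
law = {(0, 0): 0.4, (1, 0): 0.3, (0, 1): 0.2, (1, 2): 0.1}

# Mathematical expectations a_1 = E xi and a_2 = E eta.
a1 = sum(p * a for (a, b), p in law.items())
a2 = sum(p * b for (a, b), p in law.items())

# Dispersions b_11, b_22 and the mixed second order moment b_12.
b11 = sum(p * (a - a1) ** 2 for (a, b), p in law.items())
b22 = sum(p * (b - a2) ** 2 for (a, b), p in law.items())
b12 = sum(p * (a - a1) * (b - a2) for (a, b), p in law.items())

# Correlation coefficient R_12 and the determinant Delta = b_11*b_22 - b_12^2.
r12 = b12 / math.sqrt(b11 * b22)
delta = b11 * b22 - b12 ** 2
```

Here the four points do not lie on one straight line, so the vector is non-degenerate and the determinant comes out strictly positive.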


The mathematical expectations of the quantities $S_n$ and $T_n$ are $na_1$
and $na_2$, respectively. The deviations of the values
$na_0 + k\alpha + l\beta$, $nb_0 + k\gamma + l\delta$ of these quantities from
their mathematical expectations are therefore

$$na_0 + k\alpha + l\beta - na_1 = u_1, \qquad nb_0 + k\gamma + l\delta - na_2 = u_2,$$

respectively.
The purpose of the present section is to establish the validity as
$n \to \infty$ of the following asymptotic expansion:

$$\begin{aligned}
P_n(k, l) &= |d|\,(2\pi n\Delta^{1/2})^{-1}e^{-(1/2n\Delta)(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)} \\
&\quad + n^{-2}(m_0 + m_1u_1 + m_2u_2) + O[n^{-3}(n^{1/2} + |u_1|^3 + |u_2|^3)],
\end{aligned} \tag{31}$$

where $m_0$, $m_1$, $m_2$ are constants (independent of $n$, $u_1$, and $u_2$).
Formula (31) is the most accurate form of the two-dimensional limit theorem
that we shall need.

First, we prove an auxiliary proposition which permits us to extend
Lemma 3 of the preceding section to the two-dimensional case (and,
simultaneously, to make Lemma 2 of §3 more precise).


Lemma 4. There exists a constant $c > 0$ such that for any point $(t, s)$ of
the region $D$,

$$|\varphi(t, s)| \le e^{-c(t^2+s^2)}.$$

Proof. Because of our hypotheses, the function $\varphi(t, s)$ can be
represented in some neighborhood of the origin in the form

$$\varphi(t, s) = 1 + i(a_1t + a_2s) - \tfrac{1}{2}(a_{11}t^2 + 2a_{12}ts + a_{22}s^2) + O(|t|^3 + |s|^3),$$

where the numbers $a_1$, $a_2$, $a_{11}$, $a_{12}$, $a_{22}$ are first and
second moments of the random vector $(\xi_i, \eta_i)$. Whence,

$$\begin{aligned}
|\varphi(t, s)|^2 &= [1 - \tfrac{1}{2}(a_{11}t^2 + 2a_{12}ts + a_{22}s^2)]^2 + (a_1t + a_2s)^2 + O(|t|^3 + |s|^3) \\
&= 1 - (a_{11}t^2 + 2a_{12}ts + a_{22}s^2) + (a_1t + a_2s)^2 + O(|t|^3 + |s|^3) \\
&= 1 - [(a_{11} - a_1^2)t^2 + 2(a_{12} - a_1a_2)ts + (a_{22} - a_2^2)s^2] + O(|t|^3 + |s|^3).
\end{aligned}$$

Here $a_{11} - a_1^2 = b_{11}$, $a_{12} - a_1a_2 = b_{12}$,
$a_{22} - a_2^2 = b_{22}$ are the second order central moments of the vector
$(\xi_i, \eta_i)$ which we discussed above. Thus,

$$|\varphi(t, s)|^2 = 1 - (b_{11}t^2 + 2b_{12}ts + b_{22}s^2) + O(|t|^3 + |s|^3).$$

But, we saw above that $\Delta = b_{11}b_{22} - b_{12}^2 > 0$ [non-degeneracy
of the vector $(\xi_i, \eta_i)$]. Therefore, the form
$b_{11}t^2 + 2b_{12}ts + b_{22}s^2$ is positive definite (since $b_{11} > 0$,
$b_{22} > 0$). This means that there exists a positive number $c_1$ such that

$$b_{11}t^2 + 2b_{12}ts + b_{22}s^2 \ge c_1(t^2 + s^2)$$

identically. Therefore, in some neighborhood of the origin,

$$|\varphi(t, s)|^2 \le 1 - c_1(t^2 + s^2) + c_2(|t|^3 + |s|^3),$$

where $c_2$ is another positive constant. If $\sigma < c_1/2c_2$ and
$|t| \le \sigma$, $|s| \le \sigma$, then

$$c_2(|t|^3 + |s|^3) \le \tfrac{1}{2}c_1(t^2 + s^2).$$

Hence,

$$|\varphi(t, s)|^2 \le 1 - \tfrac{1}{2}c_1(t^2 + s^2) \le e^{-\frac{1}{2}c_1(t^2+s^2)},$$

or

$$|\varphi(t, s)| \le e^{-\frac{1}{4}c_1(t^2+s^2)} \qquad (|t| \le \sigma,\ |s| \le \sigma). \tag{32}$$

Let us denote by $Q$ the square $|t| \le \sigma$, $|s| \le \sigma$ of the
$(t, s)$-plane. In view of Lemma 2, we have $|\varphi(t, s)| < 1$ everywhere
within the (closed) region $D - Q$. Since the function $|\varphi(t, s)|$ is
continuous, a number $\mu < 1$ exists such that $|\varphi(t, s)| \le \mu$
everywhere in the region $D - Q$. If we denote by $\Lambda$ the largest value
of the sum $t^2 + s^2$ in the region $D$ and choose $c_3 > 0$ sufficiently
small so that

$$\mu < e^{-c_3\Lambda},$$

then

$$|\varphi(t, s)| \le \mu < e^{-c_3\Lambda} \le e^{-c_3(t^2+s^2)} \tag{33}$$

everywhere in the region $D - Q$. Finally, if we denote by $c$ the smaller of
the numbers $\tfrac{1}{4}c_1$ and $c_3$, then in view of (32) and (33), we have

$$|\varphi(t, s)| \le e^{-c(t^2+s^2)}$$

both in the region $Q$ and in the region $D - Q$ and hence in the whole region
$D$. This proves Lemma 4.

Turning now to the proof of the asymptotic formula (31), we again
divide our discussion into steps corresponding to those of the preceding
section in which we proved the one-dimensional limit theorem.

1. We again denote by $\lambda$ an arbitrary constant lying between $3/7$ and
$1/2$ and put $n^{-\lambda} = \sigma$.

The double integral in (30) extends over the region $D$ defined by the
inequalities (D). We divide this region into two parts, one being the square
$Q$: $|t| \le \sigma$, $|s| \le \sigma$, and the other the complement $D - Q$
of the square $Q$ in the original region $D$. In the region $D - Q$ at least
one of the numbers $|t|$, $|s|$ exceeds $\sigma$ and so by Lemma 4,
$|\varphi(t, s)| < e^{-c\sigma^2}$. Hence we find (taking into account that
the area of the region $D$ is $4\pi^2/|d|$)

$$\Bigl|\iint_{D-Q} e^{-i[(na_0+k\alpha+l\beta)t+(nb_0+k\gamma+l\delta)s]}\varphi^n(t, s)\,dt\,ds\Bigr|
\le (4\pi^2/|d|)\,e^{-cn\sigma^2} = o(n^{-\omega}),$$

with $\omega$ any real number. In view of (30), this yields for
$-\infty < k, l < \infty$,

$$P_n(k, l) = (|d|/4\pi^2)\iint_Q e^{-i[(na_0+k\alpha+l\beta)t+(nb_0+k\gamma+l\delta)s]}\varphi^n(t, s)\,dt\,ds + o(n^{-\omega}), \tag{34}$$

in complete analogy with (21) of the preceding section.
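2. Since the function $\varphi(t, s)$ has partial derivatives up to the fifth
order inclusive at $t = s = 0$, we can (by assuming that $|t| \le \sigma$,
$|s| \le \sigma$) write the expansion

$$\ln\varphi(t, s) = iP_1(t, s) - \tfrac{1}{2}P_2(t, s) + iP_3(t, s) + P_4(t, s) + O(\tau^5),$$

where $\tau = |s| + |t|$ and $P_r(t, s)$ denotes a homogeneous real polynomial
of degree $r$ in $t$ and $s$. In particular,

$$P_1(t, s) = a_1t + a_2s, \qquad P_2(t, s) = b_{11}t^2 + 2b_{12}ts + b_{22}s^2,$$

and for any $r$,

$$P_r(t, s) = O(\tau^r).$$

Hence we obtain for the logarithm of the integrand in (34),

$$\begin{aligned}
&-i(na_1 + u_1)t - i(na_2 + u_2)s + n\ln\varphi(t, s) \\
&\qquad = -i(u_1t + u_2s) - \tfrac{1}{2}n(b_{11}t^2 + 2b_{12}ts + b_{22}s^2) + inP_3 + nP_4 + O(n\tau^5).
\end{aligned}$$

Since $\lambda > 3/7$, we have $n^4\tau^{12} = o(n\tau^5)$ and consequently

$$\cos nP_3 = 1 - \tfrac{1}{2}n^2P_3^2 + O(n^4\tau^{12}) = 1 - \tfrac{1}{2}n^2P_3^2 + O(n\tau^5).$$

Since it also follows that

$$e^{nP_4} = 1 + nP_4 + O(n^2\tau^8) = 1 + nP_4 + O(n\tau^5),$$

we find

$$\begin{aligned}
&e^{-i[(na_1+u_1)t+(na_2+u_2)s]}\varphi^n(t, s) \\
&\quad = e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2}[\cos nP_3 + i\sin nP_3][1 + nP_4 + O(n\tau^5)] \\
&\quad = e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2}[1 - \tfrac{1}{2}n^2P_3^2 + O(n\tau^5) + i\sin nP_3][1 + nP_4 + O(n\tau^5)] \\
&\quad = e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2}\{1 + nP_4 - \tfrac{1}{2}n^2P_3^2 + i\sin nP_3 - \tfrac{1}{2}n^2P_3^2\,nP_4 \\
&\qquad\qquad + nP_4\,i\sin nP_3 + O(n\tau^5)\}.
\end{aligned}$$

But here, since $\lambda > 3/7 > 1/3$,

$$\tfrac{1}{2}n^2P_3^2\,nP_4 = O(n^3\tau^{10}) = O(n^2\tau^7), \qquad nP_4\sin nP_3 = O(n^2\tau^7),$$

and we find, finally,

$$\begin{aligned}
&e^{-i[(na_1+u_1)t+(na_2+u_2)s]}\varphi^n(t, s) \\
&\quad = e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2}\{1 + nP_4 - \tfrac{1}{2}n^2P_3^2 + i\sin nP_3 + O(n\tau^5) + O(n^2\tau^7)\}.
\end{aligned} \tag{35}$$

This asymptotic expression for the integrand of (34) is completely analogous
to (23) of the preceding section.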
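3. First let us estimate the integral

$$(|d|/4\pi^2)\iint_Q e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2(t,s)}\,dt\,ds$$

of the main term of (35). Since

$$u_1t + u_2s = u_1[t + (b_{12}s/b_{11})] + s[u_2 - (b_{12}u_1/b_{11})]$$

and

$$P_2(t, s) = b_{11}t^2 + 2b_{12}ts + b_{22}s^2 = b_{11}[t + (b_{12}s/b_{11})]^2 + \Delta s^2/b_{11},$$

we have

$$\begin{aligned}
&(|d|/4\pi^2)\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2(t,s)}\,dt\,ds \\
&\quad = (|d|/4\pi^2)\int_{-\infty}^{\infty} e^{-is[u_2-(b_{12}u_1/b_{11})]-\frac{1}{2}n(\Delta/b_{11})s^2}\,ds
\int_{-\infty}^{\infty} e^{-iu_1[t+(b_{12}s/b_{11})]-\frac{1}{2}nb_{11}[t+(b_{12}s/b_{11})]^2}\,dt \\
&\quad = (|d|/2\pi)\int_{-\infty}^{\infty} e^{-is[u_2-(b_{12}u_1/b_{11})]-\frac{1}{2}n(\Delta/b_{11})s^2}\,ds
\cdot (1/2\pi)\int_{-\infty}^{\infty} e^{-iu_1v-\frac{1}{2}nb_{11}v^2}\,dv.
\end{aligned}$$

Now let us apply formula (L) of the preceding section to both of the
integrals on the right side. This gives for the first integral

$$|d|\,b_{11}^{1/2}(2\pi n\Delta)^{-1/2}e^{-(b_{11}/2n\Delta)[u_2-(b_{12}u_1/b_{11})]^2}$$

and for the second

$$(2\pi nb_{11})^{-1/2}e^{-u_1^2/2nb_{11}}.$$

Thus, we find

$$(|d|/4\pi^2)\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2(t,s)}\,dt\,ds
= (|d|/2\pi n\Delta^{1/2})\,e^{-(1/2n\Delta)(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)}.$$

This is precisely the leading term of the right side of (31). However, we
integrated over the whole plane $(-\infty < t, s < \infty)$ instead of over
the square $Q$ and we must therefore estimate the error incurred. To do this
we note, for example, that

$$\begin{aligned}
\int_{\sigma}^{\infty} ds\int_{-\infty}^{\infty} e^{-\frac{1}{2}nP_2}\,dt
&= \int_{\sigma}^{\infty} e^{-n\Delta s^2/2b_{11}}\,ds\int_{-\infty}^{\infty} e^{-\frac{1}{2}nb_{11}[t+(b_{12}s/b_{11})]^2}\,dt \\
&= \int_{\sigma}^{\infty} e^{-n\Delta s^2/2b_{11}}\,ds\int_{-\infty}^{\infty} e^{-\frac{1}{2}nb_{11}v^2}\,dv = o(n^{-\omega})
\end{aligned}$$

for all real $\omega$ since (20) of §4 gives this estimate for the first
factor, and the second factor approaches zero as $n \to \infty$. Hence we can
consider

$$(|d|/4\pi^2)\iint_Q e^{-i(u_1t+u_2s)-\frac{1}{2}nP_2(t,s)}\,dt\,ds
= (|d|/2\pi n\Delta^{1/2})\,e^{-(1/2n\Delta)(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)} + o(n^{-\omega}) \tag{36}$$

as established and can proceed to estimate the remaining terms.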

4. For this purpose we replace $e^{-i(u_1t+u_2s)}$ in the right side of (35)
by $\cos(u_1t + u_2s) - i\sin(u_1t + u_2s)$, in complete analogy with the
computations of the previous section. Then we multiply term by term using the
fact that $\cos(u_1t + u_2s) = 1 + O(u^2\tau^2)$, where $u = |u_1| + |u_2|$.
As in the one-dimensional case, we retain only terms of even order in the
variables $t$, $s$ since the odd terms drop out after integration over the
square $Q$. This gives for the set of remaining terms the expression

$$\begin{aligned}
&e^{-\frac{1}{2}nP_2}[nP_4 - \tfrac{1}{2}n^2P_3^2 + \sin nP_3\,\sin(u_1t + u_2s) \\
&\qquad + O(u^2n^2\tau^8) + O(u^2n\tau^6) + O(n\tau^5) + O(n^2\tau^7)] \\
&\quad = e^{-\frac{1}{2}nP_2}[nP_4 - \tfrac{1}{2}n^2P_3^2 + nP_3(u_1t + u_2s) + O(un^3\tau^{10}) \\
&\qquad + O(u^3n\tau^6) + O(u^2n^2\tau^8) + O(u^2n\tau^6) + O(n\tau^5) + O(n^2\tau^7)].
\end{aligned} \tag{37}$$

Now we must integrate the expression (37) over the region $Q$. We can
integrate the first three terms over the whole plane
$(-\infty < t, s < \infty)$ instead of over the region $Q$ and incur thereby
an error only of order $o(n^{-\omega})$, where $\omega$ is any real number. In
order to see this, it is evidently sufficient to show that

$$I(p, q) = \int_{\sigma}^{\infty} s^q\,ds\int_{-\infty}^{\infty} t^p e^{-\frac{1}{2}nP_2}\,dt = o(n^{-\omega})$$

for all integers $p, q \ge 0$. But, putting $t + (b_{12}s/b_{11}) = z$, we
evidently get

$$P_2 = (\Delta s^2/b_{11}) + b_{11}z^2.$$

Hence,

$$I(p, q) = \int_{\sigma}^{\infty} s^q e^{-n\Delta s^2/2b_{11}}\,ds\int_{-\infty}^{\infty} [z - (b_{12}s/b_{11})]^p e^{-\frac{1}{2}nb_{11}z^2}\,dz.$$

Expanding $[z - (b_{12}s/b_{11})]^p$ by the binomial formula, we see (with no
computation) that the integral $I(p, q)$ is represented as a sum of products
of the form

$$\int_{\sigma}^{\infty} s^{\mu} e^{-n\Delta s^2/2b_{11}}\,ds\int_{-\infty}^{\infty} z^{\nu} e^{-\frac{1}{2}nb_{11}z^2}\,dz,$$

with non-negative integers $\mu$, $\nu$, where as $n$ increases the first
factor has the form $o(n^{-\omega})$ by (20) of the previous section, and the
second factor in any case remains bounded. This shows that
$I(p, q) = o(n^{-\omega})$.

In regard to the remaining terms, it is clear that extending the region
of integration only strengthens the inequality obtained. Consequently, we
can also integrate these terms over the whole $(t, s)$-plane instead of over
the region $Q$.

For the homogeneous polynomial P,(t, s) of degree r, the transformation 
in? = t', sn? = s’ yields 


f Í P,(t, s)e 2 dt ds = nF *%e, 
with c independent of n. Therefore, integrating the first three terms of the 
expansion (37) over the whole (t, s)-plane yields an expression of the form 
na (mo + mu, + mauw), 


where mo , mı , Mm: are independent of n, u, and us. 
After integration we obtain the following estimates for the last six terms 
of (37): 


O(un™), O(u'n™), O(wn™), O(w’n™), O(n), O(n’). 
Hence, after integration the whole set of remainder terms yields 
n (mo + mu + mu) + O[n*(n? +u’). 
Combining this with the integral (36) of the main term, we find 


(| d 1/4) [f geet tien inte terrae) 8 y s) di ds 
Q 
= (id | /2 Ab) 9 (1/204) i ua?—2b1 aug Hb2901 7) 


+ n° (m + mun + mu) + O[n*(n? + u’), 


where mo, Mı , m denote numbers independent of n, u; and uz. In view 
of (34), we then have 


P,(k, 1) = ( |d] / Qn!) e 1/274) br iua?—2brgui ue Hru?) 
+ n (mo + mu + mu) + O[n*(n} + u’)). 


This is the formula we wished to establish. 

All that remains is for us to establish, in analogy with (28) of the
preceding section, a cruder estimate for $P_n(k, l)$. This estimate will be
completely adequate for many cases. In analogy with §4 we replace the factor
$e^{-i(u_1 t + u_2 s)}$ in (35) by its trigonometric expression. We do not,
however, replace $\cos (u_1 t + u_2 s)$ by $1 + O(u^2 r^2)$ as we did above. We
simply use the fact that $\cos (u_1 t + u_2 s)$ is an even function of $t$ and
$s$ whose absolute value does not exceed unity. The remainder of the
derivation is so closely analogous to the one-dimensional case that we need
not repeat it here. We only cite the final result, which is

$$P_n(k, l) = \frac{|d|}{2\pi n \Delta^{1/2}} \exp\left\{-\frac{1}{2n\Delta}\left(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2\right)\right\} + O[n^{-2}(1 + u)]. \tag{39}$$


In conclusion, we give the complete formulation of the two-dimensional 
limit theorem just proved. 
THEOREM 2. Let

$$p_{kl} = P(\xi = a_0 + k\alpha + l\beta,\ \eta = b_0 + k\gamma + l\delta)$$

be the distribution of a non-degenerate integral-valued random vector
$(\xi, \eta)$ with the maximal lattice $a_0 + k\alpha + l\beta$,
$b_0 + k\gamma + l\delta$, the mathematical expectations $a_1$, $a_2$, the
dispersions $b_{11}$, $b_{22}$, and the correlation coefficient
$b_{12}(b_{11}b_{22})^{-1/2}$. Let $(S_n, T_n)$ be the sum of $n$ mutually
independent random vectors each of which obeys the law $p_{kl}$, and let

$$P_n(k, l) = P(S_n = na_0 + k\alpha + l\beta,\ T_n = nb_0 + k\gamma + l\delta).$$

If finite moments up to the fifth order inclusive exist for the law $p_{kl}$,
then as $n \to \infty$ the following relations hold uniformly in $k$ and $l$:


$$P_n(k, l) = \frac{|d|}{2\pi n \Delta^{1/2}} \exp\left\{-\frac{1}{2n\Delta}\left(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2\right)\right\} + O[n^{-2}(1 + u)], \tag{39}$$
and 
Pilkey 1) = (|d [dene ee aerate 
(38) 


+n? (mo + mu + mu) + O[n*(n? +u’), 


where $d = \alpha\delta - \beta\gamma \neq 0$, $\Delta = b_{11}b_{22} - b_{12}^2 > 0$,
$u_1 = na_0 + k\alpha + l\beta - na_1$, $u_2 = nb_0 + k\gamma + l\delta - na_2$,
$u = |u_1| + |u_2|$, and where $m_0$, $m_1$, $m_2$ denote constants
(independent of $n$, $u_1$ and $u_2$).
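The main term of the theorem can be checked numerically in a simple special case. The following Python sketch is an illustration only and is not part of the original argument. It assumes a vector whose two components are independent and each take the values 0 and 1 with probability 1/2, so that $a_0 = b_0 = 0$, $\alpha = \delta = 1$, $\beta = \gamma = 0$, $d = 1$, $a_1 = a_2 = 1/2$, $b_{11} = b_{22} = 1/4$, $b_{12} = 0$. It also assumes that the main term is read as $|d|/(2\pi n\Delta^{1/2})$ times the exponential of the quadratic form. The function names are hypothetical.

```python
import math


def exact_prob(n, k, l):
    """Exact P_n(k, l) when each summand has two independent fair 0/1 components."""
    return math.comb(n, k) * math.comb(n, l) / 4 ** n


def main_term(n, k, l):
    """Gaussian main term for the same example (constants assumed as stated above)."""
    d = 1.0                        # alpha*delta - beta*gamma for the unit lattice
    a1 = a2 = 0.5                  # mathematical expectations
    b11 = b22 = 0.25               # dispersions
    b12 = 0.0                      # mixed second moment (independent components)
    det_b = b11 * b22 - b12 ** 2   # the quantity written Delta in the theorem
    u1 = k - n * a1
    u2 = l - n * a2
    quad = b22 * u1 ** 2 - 2 * b12 * u1 * u2 + b11 * u2 ** 2
    return abs(d) / (2 * math.pi * n * math.sqrt(det_b)) * math.exp(-quad / (2 * n * det_b))


def scaled_error(n, k, l):
    """(exact value - main term) * n^2; this should stay bounded for bounded u."""
    return (exact_prob(n, k, l) - main_term(n, k, l)) * n ** 2


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, scaled_error(n, n // 2, n // 2), scaled_error(n, n // 2 + 3, n // 2 - 2))
```

If the remainder is of order $n^{-2}(1 + u)$, the printed values should remain bounded as $n$ grows, which is what the sketch lets one observe for this one distribution.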


Chapter II 


PRELIMINARY CONCEPTS OF QUANTUM 
MECHANICS 


§1. Description of the state of a physical system in quantum mechanics 


In classical mechanics the state of a system with $s$ degrees of freedom is
conveniently described by giving the values of all $2s$ “Hamiltonian
variables”: the “generalized coordinates” $q_1, q_2, \ldots, q_s$ and the
“generalized momenta” $p_1, p_2, \ldots, p_s$. Thus, in classical mechanics
the state of a system can be represented by a point in its “phase space”,
which is a Euclidean space of $2s$ dimensions with rectangular coordinates
$q_1, q_2, \ldots, q_s, p_1, p_2, \ldots, p_s$. It is well known that such a
representation of the state of a system is particularly convenient for the
purposes of statistical mechanics. Each physical quantity related to the
system is a certain function $F(q_1, \ldots, q_s, p_1, \ldots, p_s)$ of its
Hamiltonian variables with a definite value for each definite state of the
system. In other words, such a quantity is a single-valued function defined
on the whole phase space.

In quantum mechanics the problem of characterizing the state of a physical
system is solved in an entirely different way. The state of the system is
described by giving a certain (in general, complex) function

$$U = U(q_1, \ldots, q_s)$$

of the generalized coordinates of the system alone. (It is common practice
to call such a function the “wave function” of the system.) A knowledge of
the function $U$ describing the state of the system does not in general make
it possible to determine uniquely any of the Hamiltonian variables
$q_1, \ldots, q_s, p_1, \ldots, p_s$. In the most favorable case, knowing the
function $U$, we can determine the values of some of these variables, but all
$2s$ variables can never be uniquely determined. The reason is that in quantum
mechanics the quantities $q_i$ and $p_i$ (for the same value of $i$) cannot
both have definite values in any single state. Of course it follows that any
physical quantity $F(q_1, \ldots, q_s, p_1, \ldots, p_s)$ does not in general
have a single definite value in the state described by the function $U$ (or,
as we say briefly, in the state $U$).

In quantum mechanics, prescribing the function $U$ determines the
distribution laws of physical quantities, but not their values. Thus, knowing
the function $U$, we can calculate the probability that any physical quantity
$\mathfrak{A}$ has a value included in a given interval $(a, b)$, i.e., the
probability of the inequality $a < \mathfrak{A} < b$. Only in the exceptional
case, when the distribution law found for the quantity $\mathfrak{A}$ is such
that $\mathfrak{A}$ has only one possible value, can it be said that
$\mathfrak{A}$ has a definite value in the state $U$.

In particular, if the system is in a state $U$, the probability that the values
of $q_1, \ldots, q_s$ belong to a given region $V$ of the configuration space
(i.e., the Euclidean space of $s$ dimensions with rectangular coordinates
$q_1, \ldots, q_s$) is given by

$$\int_V |U|^2 \, dq_1 \cdots dq_s \Big/ \int |U|^2 \, dq_1 \cdots dq_s ,$$

where the integral in the denominator is taken over the whole configuration
space. In the even more special case of a system consisting of an elementary
particle with the simplest Hamiltonian variables $x, y, z, p_x, p_y, p_z$, the
probability that this particle belongs to the region $V$ of the ordinary
three-dimensional space is expressed by

$$\int_V |U(x, y, z)|^2 \, dx \, dy \, dz \Big/ \int |U(x, y, z)|^2 \, dx \, dy \, dz ,$$

where the integral in the denominator is taken over the whole
three-dimensional space.

Since the function $U$ determines the distribution law of a physical
quantity $\mathfrak{A}$, it is clear that $U$ also determines uniquely the
mathematical expectation $E_U\mathfrak{A}$ of this quantity. In particular, if

$$\mathfrak{A} = F(q_1, \ldots, q_s)$$

is any function of the generalized coordinates $q_1, \ldots, q_s$, then

$$E_U\mathfrak{A} = \int F(q_1, \ldots, q_s)\, |U|^2 \, dq_1 \cdots dq_s \Big/ \int |U|^2 \, dq_1 \cdots dq_s , \tag{1}$$

where both integrals are taken over the whole configuration space.

All the formulas we have presented show that replacing the function $U$
by a function $\lambda U$, where $\lambda \neq 0$ is an arbitrary complex
constant, does not produce any changes in the statistics of those physical
quantities which depend only on the generalized coordinates
$q_1, \ldots, q_s$ (i.e., that do not depend on the generalized momenta
$p_1, \ldots, p_s$). We shall see later that this same rule also remains valid
for physical quantities of any sort. Thus, we must regard the functions $U$
and $\lambda U$ as describing the same state of the system, whatever may be
the value of the complex constant $\lambda \neq 0$. Making use of this freedom
in the choice of $\lambda$, we can obviously describe any state of the system
by a wave function $U$ for which

$$\int |U|^2 \, dq_1 \cdots dq_s = 1.$$

We then refer to the function $U$ as normalized. For normalized functions
$U$ all the formulas we have written are simplified, since their denominators
become unity. If the function $U$ describing a state of the system is a
normalized function and $\lambda = e^{i\theta}$ is an arbitrary complex
constant with absolute value 1, then $\lambda U$ is also normalized and
describes the same state of the system.
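The probability and expectation formulas of this section can be tried out numerically. The following Python sketch is an illustration under stated assumptions: a single degree of freedom, integrals replaced by Riemann sums on a finite grid, and a hypothetical wave function $U(q) = e^{-q^2/2 + iq}$. All function names are invented for the example.

```python
import cmath
import math


def make_grid(lo, hi, steps):
    """Equally spaced grid on [lo, hi] together with its step h."""
    h = (hi - lo) / steps
    return [lo + j * h for j in range(steps + 1)], h


def norm_sq(u, h):
    """Riemann-sum approximation of the integral of |U|^2."""
    return sum(abs(v) ** 2 for v in u) * h


def probability(u, grid, h, a, b):
    """Probability that q lies in (a, b); the ratio form needs no normalization."""
    inside = sum(abs(v) ** 2 for q, v in zip(grid, u) if a < q < b) * h
    return inside / norm_sq(u, h)


def expectation(u, grid, h, f):
    """Mathematical expectation of F(q) in the state U, in the ratio form of (1)."""
    return sum(f(q) * abs(v) ** 2 for q, v in zip(grid, u)) * h / norm_sq(u, h)


def normalize(u, h):
    """Divide U by the square root of its norm so that the integral of |U|^2 is 1."""
    c = math.sqrt(norm_sq(u, h))
    return [v / c for v in u]


grid, h = make_grid(-8.0, 8.0, 1600)
U = [cmath.exp(-q * q / 2 + 1j * q) for q in grid]   # hypothetical wave function
lam = 3.0 * cmath.exp(0.7j)                          # arbitrary nonzero constant
lam_U = [lam * v for v in U]

if __name__ == "__main__":
    print(probability(U, grid, h, -1.0, 1.0), probability(lam_U, grid, h, -1.0, 1.0))
    print(expectation(U, grid, h, lambda q: q * q))
    print(norm_sq(normalize(U, h), h))
```

The first printed pair illustrates that $U$ and $\lambda U$ yield the same statistics; the last line illustrates normalization.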

It is obvious that the use of the function $U$ to establish the statistics of
physical quantities, in the way we have been describing, requires the
convergence of all the integrals used in this process. In particular, $|U|^2$
must be integrable over the whole configuration space (that is, $U$ must be an
element of a complex Hilbert space). If this requirement is satisfied by
$U_1$ and $U_2$, then the integral

$$\int U_1 U_2^* \, dq_1 \cdots dq_s$$

(where the star indicates complex conjugation) taken over the whole
configuration space is absolutely convergent. We call this integral the
scalar (inner) product of $U_1$ and $U_2$, and write it briefly as
$(U_1, U_2)$. In particular, it follows that

$$(U, U) = \int |U|^2 \, dq_1 \cdots dq_s .$$

Hence, if $U$ is normalized,

$$(U, U) = 1.$$

Scalar products obviously have the following elementary properties:
1. $(U_1 + U_2, U_3) = (U_1, U_3) + (U_2, U_3)$,

   $(U_1, U_2 + U_3) = (U_1, U_2) + (U_1, U_3)$.
2. If $\lambda$ is an arbitrary complex number, then

   $(\lambda U_1, U_2) = \lambda (U_1, U_2)$,

   $(U_1, \lambda U_2) = \lambda^* (U_1, U_2)$.


3. $(U_2, U_1) = (U_1, U_2)^*$.
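The three properties can be verified on a discrete analogue of the scalar product. The sketch below is only an illustration: it assumes that functions are replaced by finite lists of complex values and the integral by a sum, with the conjugation placed on the second argument as in the definition above. The name `inner` and the sample vectors are hypothetical.

```python
def inner(u1, u2, h=1.0):
    """Discrete analogue of (U1, U2): sum of U1 * conj(U2), times the grid step h."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u1, u2)) * h


def add(u1, u2):
    """Pointwise sum of two sampled functions."""
    return [a + b for a, b in zip(u1, u2)]


def scale(lam, u):
    """Pointwise product of a sampled function with the constant lam."""
    return [lam * a for a in u]


U1 = [1 + 2j, 0.5j, -1.0]
U2 = [2.0, 1 - 1j, 3j]
U3 = [-1j, 4.0, 1 + 1j]
lam = 2 - 3j

if __name__ == "__main__":
    # Property 1: additivity in the first argument.
    print(inner(add(U1, U2), U3), inner(U1, U3) + inner(U2, U3))
    # Property 2: a constant leaves the second argument as its conjugate.
    print(inner(U1, scale(lam, U2)), lam.conjugate() * inner(U1, U2))
    # Property 3: interchanging the arguments conjugates the product.
    print(inner(U2, U1), inner(U1, U2).conjugate())
```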

One of the most important principles of quantum mechanics is the so-called
principle of superposition: If the functions $U_1(q_1, \ldots, q_s)$ and
$U_2(q_1, \ldots, q_s)$ describe possible states of the system, then the
function $U = \lambda_1 U_1 + \lambda_2 U_2$, where $\lambda_1$, $\lambda_2$
are arbitrary complex constants and $|\lambda_1| + |\lambda_2| > 0$, describes
a possible state of the system. This principle is automatically extended to
the case of any number of components. It can also be extended to infinite
series: If the series

$$\sum_{n=1}^{\infty} \lambda_n U_n$$

converges in the mean to the function $U$ [i.e., if
$(U - S_n, U - S_n) \to 0$ as $n \to \infty$, where
$S_n = \sum_{k=1}^{n} \lambda_k U_k$], and if each of the functions $U_n$
describes a possible state of the system, then so does $U$. In all cases the
only exception is that of a sum identically equal to zero: The function
$U = 0$ does not describe any state of the system; it cannot be normalized and
all the formulas we have given above for probabilities and mathematical
expectations become meaningless for $U = 0$.

A number of cases can be found in the history of statistical physics where 
an imprecisely defined concept of the probability of an event has led to 
confusion and to misunderstanding, and has thus hindered the development 
of science. In physics, as in all applied sciences, the probability of an event
always means the relative frequency of its occurrence. The same event,
however, can have very different probabilities under different conditions. 
Therefore, any assertion regarding probabilities has a precise meaning only 
when the conditions under which it applies are stated precisely. 

In quantum mechanics, as we have already seen in part and shall see in
considerably more detail below, the most important assertions are of the
form: “If the system is in a state $U$ and $\mathfrak{A}$ is a quantity
related to this system, then the probability that $a < \mathfrak{A} < b$ is
equal to the number $p$.” To avoid confusion and ambiguity it is therefore
necessary first to explain the meaning of such an assertion with complete
precision. Suppose we have a large number $n$ of systems of the type
considered. Let all these systems exist in the state $U$ and suppose that on
each of these systems we carry out a measurement of the quantity
$\mathfrak{A}$. We assume that when this is done the value of $\mathfrak{A}$
lies between the numbers $a$ and $b$ for $m$ systems, and that for the other
$n - m$ systems the value of $\mathfrak{A}$ lies outside the interval $(a, b)$.
The meaning of our probabilistic assertion is that if $n$ is a large number,
the ratio $m/n$ is close to $p$.
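The frequency meaning of such an assertion can be imitated by a small simulation. The following Python sketch is an idealization and not a model of any real measurement: it assumes that each of the $n$ measurements independently yields a value in $(a, b)$ with the given probability $p$, and it uses a pseudo-random generator with a fixed seed. The function name is hypothetical.

```python
import random


def relative_frequency(p, n, seed=0):
    """Simulate n idealized measurements, each landing in (a, b) with probability p."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(n) if rng.random() < p)   # number of favorable outcomes
    return m / n


if __name__ == "__main__":
    for n in (100, 10000, 1000000):
        print(n, relative_frequency(0.3, n))
```

For large $n$ the printed ratio $m/n$ should lie close to the assumed value of $p$.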

This statistical meaning holds in general for all the probabilistic
assertions of quantum mechanics. A very important feature of such a
statistical prediction is that its only underlying assumption is the
requirement that each of the $n$ systems should be in the state $U$ at the
moment of measurement. All other conditions (which in various cases can be of
very different sorts) have no influence at all on the probability of the
event $a < \mathfrak{A} < b$.
It is thus completely immaterial whether or not all the n measurements are 
carried out at the same place and at the same time or whether they are 
carried out at places extremely remote from each other both spatially 
and temporally. It is also completely immaterial when and in what manner 
each of the given systems has been brought into the state U. In particular, 
we can carry out all n measurements on a single system. All that is neces- 
sary is that after one measurement has been carried out the system must 
again be brought into the state U before the following measurement is 
made. 


The fact that the statistical predictions of quantum mechanics are inde- 
pendent of all supplementary conditions is one of the most important gen- 
eral principles of the theory and gives the predictions a real statistical 
meaning. 


§2. Physical quantities and self-adjoint linear operators 


In §1 we saw the manner in which quantum mechanics allows us to define,
for each state $U$, the statistics of a quantity $\mathfrak{A}$ which depends
only on the generalized coordinates $q_1, \ldots, q_s$ of the system. But, in
general, a physical quantity $\mathfrak{A}$ depends on the whole set of
Hamiltonian variables $q_1, \ldots, q_s, p_1, \ldots, p_s$, and we now have to
formulate the rules which will permit us to define the statistics of such a
quantity for any state $U$ of the system. For this purpose we use the concept
of a linear self-adjoint operator.

An operator $\mathcal{A}$ is a rule which assigns to each element $U$ of the
complex Hilbert space a definite element $\mathcal{A}U$ of the same space.
(Sometimes $\mathcal{A}$ is not defined for all $U$ but only for a subset of
the set of all states.) An operator $\mathcal{A}$ is said to be linear if

$$\mathcal{A}(\lambda_1 U_1 + \lambda_2 U_2) = \lambda_1 \mathcal{A}U_1 + \lambda_2 \mathcal{A}U_2$$

for arbitrary $U_1$, $U_2$ and all complex constants $\lambda_1$, $\lambda_2$.
To each operator $\mathcal{A}$ there corresponds a definite complex adjoint
operator $\mathcal{A}^*$ defined by the condition

$$(\mathcal{A}U_1, U_2) = (U_1, \mathcal{A}^*U_2).$$

An operator $\mathcal{A}$ is said to be self-adjoint (or Hermitean) if

$$(\mathcal{A}U_1, U_2) = (U_1, \mathcal{A}U_2) \tag{2}$$

for all $U_1$, $U_2$.

In quantum mechanics a linear self-adjoint operator $\mathcal{A}$ is assigned
to each physical quantity $\mathfrak{A}$. The statistics of the quantity are
established for each state $U$ of the system by means of this operator. The
manner in which this is done will be stated below. At present, we note to
begin with that if the quantity $\mathfrak{A}$ is a function of just the
generalized coordinates $q_1, \ldots, q_s$, then the operator $\mathcal{A}$
assigned to it is defined simply as multiplication by this function: If

$$\mathfrak{A} = F(q_1, \ldots, q_s),$$

then

$$\mathcal{A}U(q_1, \ldots, q_s) = F(q_1, \ldots, q_s)\, U(q_1, \ldots, q_s).$$

The linearity of this operator is easily verified. Moreover, if the function
$F$ is real, then

$$(\mathcal{A}U_1, U_2) = \int (FU_1) U_2^* \, dq_1 \cdots dq_s
= \int U_1 (FU_2)^* \, dq_1 \cdots dq_s = (U_1, \mathcal{A}U_2).$$

This shows that a real physical quantity of the form
$\mathfrak{A} = F(q_1, \ldots, q_s)$ always has a definite self-adjoint
operator corresponding to it.
We now consider the general case of quantities of the form

$$\mathfrak{A} = F(q_1, \ldots, q_s, p_1, \ldots, p_s).$$

The way in which the self-adjoint operator corresponding to such a quantity
is defined will not interest us in this book. Therefore, we shall give only a
brief description of it. First, we consider the simplest case
$\mathfrak{A} = p_k$ $(1 \leq k \leq s)$. We choose the operator

$$\mathcal{P}_k = -i\hbar \, \partial/\partial q_k$$

to correspond to the generalized momentum $p_k$, i.e., we set

$$\mathcal{P}_k U = -i\hbar \, \partial U/\partial q_k ,$$

where $\hbar$ is Planck’s constant divided by $2\pi$. This operator is clearly
linear. To prove that it is self-adjoint we note that

$$(\mathcal{P}_k U_1, U_2) = (-i\hbar \, \partial U_1/\partial q_k, U_2)
= -i\hbar \int (\partial U_1/\partial q_k)\, U_2^* \, dq_1 \cdots dq_s .$$

But for fixed $q_1, \ldots, q_{k-1}, q_{k+1}, \ldots, q_s$,

$$\int_{-\infty}^{\infty} (\partial U_1/\partial q_k)\, U_2^* \, dq_k
= U_1 U_2^* \Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} U_1 (\partial U_2^*/\partial q_k) \, dq_k
= -\int_{-\infty}^{\infty} U_1 (\partial U_2^*/\partial q_k) \, dq_k$$

if, as we assume, the functions $U_1$, $U_2$ vanish for $q_k = -\infty$ and
$q_k = \infty$. Therefore,

$$(\mathcal{P}_k U_1, U_2) = i\hbar \int U_1 (\partial U_2^*/\partial q_k) \, dq_1 \cdots dq_s
= \int U_1 (-i\hbar \, \partial U_2/\partial q_k)^* \, dq_1 \cdots dq_s
= (U_1, \mathcal{P}_k U_2),$$

as was to be proved. It is worthwhile pointing out that the self-adjoint
property of the operator $\mathcal{P}_k$ depends in an essential way on the
presence of the imaginary unit as a factor in the definition of this operator.
The real operator $\hbar \, \partial/\partial q_k$ does not have this property
of being self-adjoint, as can readily be verified.
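The role of the imaginary unit can be seen in a discrete analogue. The following Python sketch is an illustration under several assumptions that are not in the text: one degree of freedom, units in which $\hbar = 1$, the derivative replaced by a central difference on a finite grid, and functions taken to vanish outside the grid. The two sample functions and all names are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def derivative(u, h):
    """Central difference, treating U as zero outside the grid."""
    n = len(u)
    out = []
    for j in range(n):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < n - 1 else 0.0
        out.append((right - left) / (2 * h))
    return out


def inner(u1, u2, h):
    """Discrete scalar product: sum of U1 * conj(U2), times the grid step."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u1, u2)) * h


def momentum(u, h):
    """Discrete analogue of -i*hbar*dU/dq."""
    return [-1j * HBAR * v for v in derivative(u, h)]


def real_operator(u, h):
    """Discrete analogue of hbar*dU/dq, without the imaginary unit."""
    return [HBAR * v for v in derivative(u, h)]


h = 0.01
grid = [-8.0 + j * h for j in range(1601)]
U1 = [math.exp(-q * q / 2) for q in grid]
U2 = [q * math.exp(-q * q / 2) for q in grid]

if __name__ == "__main__":
    print(inner(momentum(U1, h), U2, h), inner(U1, momentum(U2, h), h))
    print(inner(real_operator(U1, h), U2, h), inner(U1, real_operator(U2, h), h))
```

The first pair of printed values should agree, while the second pair should differ in sign, in line with the integration by parts carried out above.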
If the physical quantity $\mathfrak{A}$ is an arbitrary function

$$\mathfrak{A} = F(q_1, \ldots, q_s, p_1, \ldots, p_s)$$

of the Hamiltonian variables of the system, the idea naturally arises of
defining the corresponding operator $\mathcal{A}$ by means of the formula

$$\mathcal{A} = F(\mathcal{Q}_1, \ldots, \mathcal{Q}_s, \mathcal{P}_1, \ldots, \mathcal{P}_s),$$

where $\mathcal{Q}_k$, $\mathcal{P}_k$ denote, respectively, the operators
assigned to the quantities $q_k$, $p_k$ $(1 \leq k \leq s)$. This is indeed the
path usually followed in quantum mechanics. This idea, which on the whole has
been extremely fruitful, encounters a practical difficulty in regard to
interpreting the operator

$$F(\mathcal{Q}_1, \ldots, \mathcal{Q}_s, \mathcal{P}_1, \ldots, \mathcal{P}_s). \tag{3}$$

The solution to this problem is by no means obvious even in the very
simple case of $F$ a polynomial with constant coefficients. To form the
polynomial (3) it is necessary to define the procedures for adding and
multiplying operators, i.e., it is necessary to formulate an algebra of
operators. This algebra is constructed as follows: If $\mathcal{A}$ and
$\mathcal{B}$ are operators, and $\alpha$, $\beta$ are real numbers, then the
operator

$$\mathcal{C} = \alpha\mathcal{A} + \beta\mathcal{B}$$

is the operator defined by

$$\mathcal{C}U = \alpha\mathcal{A}U + \beta\mathcal{B}U.$$

If $\mathcal{A}$ and $\mathcal{B}$ are linear self-adjoint operators,
$\mathcal{C}$ will be an operator of the same kind. We call the operator
$\mathcal{C}$ defined by

$$\mathcal{C}U = \mathcal{A}(\mathcal{B}U)$$

the product $\mathcal{A}\mathcal{B}$ of the operators $\mathcal{A}$,
$\mathcal{B}$. Hence the result of applying $\mathcal{A}\mathcal{B}$ is the
same as the result of successive application to the same function of first
$\mathcal{B}$ and then $\mathcal{A}$. Multiplication of operators is in general
not commutative: $\mathcal{A}\mathcal{B}U$ is not in general the same as
$\mathcal{B}\mathcal{A}U$. For example,

$$\mathcal{P}_k\mathcal{Q}_k U = -i\hbar \, \partial(Uq_k)/\partial q_k = -i\hbar\,[U + q_k \, \partial U/\partial q_k],$$

while

$$\mathcal{Q}_k\mathcal{P}_k U = q_k(-i\hbar \, \partial U/\partial q_k) = -i\hbar \, q_k \, \partial U/\partial q_k .$$

Due to the non-commutativity of multiplication for operators, the meaning
of (3) is ambiguous, even in the case of a polynomial. This raises a
difficulty in the choice of the operator corresponding to a given physical
quantity. The product of two self-adjoint operators will be a self-adjoint
operator only if the two factors commute with each other. We note also
that since each operator obviously commutes with itself, a positive integral
power of a self-adjoint operator is always a self-adjoint operator.
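The example of non-commuting operators can be checked formally on polynomials. The following Python sketch is a purely algebraic illustration: it assumes one degree of freedom and units with $\hbar = 1$, and it represents a polynomial in $q$ by its list of coefficients. Polynomials are not square-integrable, so this checks only the algebra of the two operators, not any statement about states. All names are hypothetical.

```python
HBAR = 1.0  # assumption: units in which hbar = 1


def apply_q(coeffs):
    """Multiply the polynomial sum(c[n] * q**n) by q."""
    return [0j] + [complex(c) for c in coeffs]


def apply_p(coeffs):
    """Apply -i*hbar*d/dq to the polynomial sum(c[n] * q**n)."""
    return [-1j * HBAR * n * complex(coeffs[n]) for n in range(1, len(coeffs))]


def subtract(a, b):
    """Coefficient-wise difference of two polynomials of possibly different length."""
    size = max(len(a), len(b))
    a = list(a) + [0j] * (size - len(a))
    b = list(b) + [0j] * (size - len(b))
    return [x - y for x, y in zip(a, b)]


def commutator_pq(coeffs):
    """(PQ - QP)U, where PQ means: first multiply by q, then differentiate."""
    return subtract(apply_p(apply_q(coeffs)), apply_q(apply_p(coeffs)))


U = [1.0, 2.0 - 1.0j, 0.0, 4.0]   # the polynomial 1 + (2 - i) q + 4 q^3

if __name__ == "__main__":
    print(commutator_pq(U))
    print([-1j * HBAR * c for c in U])
```

Subtracting the two displayed formulas gives $(\mathcal{P}_k\mathcal{Q}_k - \mathcal{Q}_k\mathcal{P}_k)U = -i\hbar U$, and the two printed lists should agree accordingly.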

Thus we see that the problem of assigning an operator to a given physical
quantity $\mathfrak{A}$ cannot always be solved by simple general methods. In
many cases a special physical consideration is required. But, for the
purposes of this book such difficulties are of no importance. Essentially,
the only important fact for us is the possibility of assigning to each
physical quantity a self-adjoint operator which will permit us to determine
the statistics of the quantity in any state of the system in a unique and in
principle very simple manner. We shall now consider the question of how these
statistics are established once the operator for the given quantity is known.

When the physical quantity

$$\mathfrak{A} = F(q_1, \ldots, q_s)$$

depends only on the generalized coordinates $q_1, \ldots, q_s$, its
mathematical expectation $E_U\mathfrak{A}$ in the state $U$ is given, as we
know [§1, (1)], by

$$E_U\mathfrak{A} = \int F(q_1, \ldots, q_s)\, |U(q_1, \ldots, q_s)|^2 \, dq_1 \cdots dq_s .$$

For simplicity, we have taken the function $U$ to be normalized. Since the
operator $\mathcal{A}$ defined by

$$\mathcal{A}U = F(q_1, \ldots, q_s)\, U$$

corresponds in this case to the quantity $\mathfrak{A}$, we may write

$$E_U\mathfrak{A} = \int F U U^* \, dq_1 \cdots dq_s
= \int (\mathcal{A}U) U^* \, dq_1 \cdots dq_s = (\mathcal{A}U, U).$$

Extending this result, which we have established for quantities
$\mathfrak{A}$ of a very special type, to the general case yields one of the
basic principles of quantum mechanics: If the linear self-adjoint operator
$\mathcal{A}$ corresponds to the physical quantity $\mathfrak{A}$, then the
mathematical expectation of $\mathfrak{A}$ in the state $U$ is equal to

$$E_U\mathfrak{A} = (\mathcal{A}U, U). \tag{4}$$

[The function $U$ is supposed normalized. In the general case obviously
$E_U\mathfrak{A} = (\mathcal{A}U, U)/(U, U)$.]

This simple general rule contains the true meaning of the “assignment”
to each physical quantity of a definite linear self-adjoint operator. We see
that when we know the operator corresponding to a given quantity, we can
actually determine by a single simple principle the mathematical expectation
of the quantity in any state $U$.

It might be thought that the result we have found determines only the
mathematical expectation of the quantity $\mathfrak{A}$, but does not
determine its distribution law. This is actually true if we know only the
operator corresponding to $\mathfrak{A}$. But in principle the method partly
described above also makes it possible to determine the complete statistics
of physical quantities related to the given system. Indeed, if we wish, for
example, to find the probability of inequalities such as

$$a < \mathfrak{A} < b,$$

where $a$, $b$ are real numbers, and if

$$\mathfrak{A} = F(q_1, \ldots, q_s, p_1, \ldots, p_s),$$

then it is sufficient to consider another physical quantity $\mathfrak{B}$
defined by

$$\mathfrak{B} = \begin{cases} 1 & \text{if } a < F < b, \\ 0 & \text{in all other cases.} \end{cases}$$

This quantity is obviously also a function of the variables
$q_1, \ldots, q_s, p_1, \ldots, p_s$ and therefore can be regarded as a
physical quantity related to the given system. On the other hand, it is
obvious that the probability of the inequality $a < \mathfrak{A} < b$, when
the system is in the state $U$, is equal to

$$P_U(a < \mathfrak{A} < b) = E_U\mathfrak{B}.$$

If the linear self-adjoint operator $\mathcal{B}$ corresponds to the quantity
$\mathfrak{B}$, then we have by (4)

$$P_U(a < \mathfrak{A} < b) = (\mathcal{B}U, U)$$

(where again for simplicity we suppose the function $U$ to be normalized).
Thus if we know how to assign a self-adjoint operator to the quantity
$\mathfrak{A}$ and also how to assign operators to each $\mathfrak{B}$, then
(4) makes it possible to find the complete distribution law of $\mathfrak{A}$.

Furthermore, we note that because of the self-adjoint property of the
operator $\mathcal{A}$, we always have

$$(\mathcal{A}U, U) = (U, \mathcal{A}U).$$

On the other hand, because of the general property of scalar products (§1,
property 3),

$$(U, \mathcal{A}U) = (\mathcal{A}U, U)^*.$$

Thus we always have

$$(\mathcal{A}U, U) = (\mathcal{A}U, U)^*,$$

i.e., the mathematical expectation of a physical quantity is always real.
This result shows clearly the meaning of the requirement that all operators
which are assigned to physical quantities be self-adjoint. We shall find
below even more decisive arguments to support this requirement.
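The reality of the mathematical expectation can be illustrated in a finite-dimensional analogue. The following Python sketch is only an illustration: it assumes that the Hilbert space is replaced by two-component complex vectors and the operator by a 2 × 2 matrix, which is self-adjoint exactly when it equals its own conjugate transpose. The matrices, the vector and the function names are hypothetical.

```python
def matvec(a, u):
    """Apply the matrix a to the vector u."""
    return [sum(a[i][j] * u[j] for j in range(len(u))) for i in range(len(a))]


def inner(u1, u2):
    """Finite-dimensional scalar product: sum of U1 * conj(U2)."""
    return sum(complex(x) * complex(y).conjugate() for x, y in zip(u1, u2))


def expectation(a, u):
    """(AU, U) / (U, U); the division makes normalization of U unnecessary."""
    return inner(matvec(a, u), u) / inner(u, u)


hermitian = [[2.0, 1j], [-1j, 3.0]]       # equal to its conjugate transpose
not_hermitian = [[2.0, 1j], [1j, 3.0]]    # not equal to its conjugate transpose
U = [1.0 + 2.0j, 0.5 - 1.0j]

if __name__ == "__main__":
    print(expectation(hermitian, U))
    print(expectation(not_hermitian, U))
```

For the self-adjoint matrix the imaginary part of the printed value should vanish up to rounding; for the other matrix it need not.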


§3. Possible values of physical quantities 


Suppose that a physical quantity $\mathfrak{A}$ has a linear self-adjoint
operator $\mathcal{A}$ assigned to it. In the preceding section we saw how,
knowing this operator $\mathcal{A}$ and the operators corresponding to various
functions of the quantity $\mathfrak{A}$, one can construct the distribution
law of this quantity in any state $U$ of the system. Sometimes this law can be
such that the quantity $\mathfrak{A}$ for the state $U$ takes on only one
possible value $\alpha$ (with probability 1). In that case we say that
$\mathfrak{A}$ has a definite value in the state $U$. Then

$$E_U\mathfrak{A} = \alpha.$$

We now consider the conditions under which this situation can occur. In
order that a random variable have a unique possible value, it is necessary
and sufficient that its dispersion be zero. The dispersion of the quantity
$\mathfrak{A}$ in the state $U$ is

$$D_U\mathfrak{A} = E_U(\mathfrak{A} - \alpha)^2.$$

On the basis of the principles explained in the preceding section, we
naturally conclude that the quantity $(\mathfrak{A} - \alpha)^2$ has a (linear
self-adjoint) operator $(\mathcal{A} - \alpha)^2$ corresponding to it. Hence
if the function $U$ is normalized,

$$D_U\mathfrak{A} = ([\mathcal{A} - \alpha]^2 U, U).$$

In virtue of the self-adjoint property of $\mathcal{A} - \alpha$ this gives

$$D_U\mathfrak{A} = ([\mathcal{A} - \alpha]U, [\mathcal{A} - \alpha]U)
= \int ([\mathcal{A} - \alpha]U)([\mathcal{A} - \alpha]U)^* \, dq_1 \cdots dq_s
= \int |[\mathcal{A} - \alpha]U|^2 \, dq_1 \cdots dq_s .$$

If $D_U\mathfrak{A} = 0$, then because of the non-negative character of the
integrand, we must have

$$|[\mathcal{A} - \alpha]U|^2 = 0$$

identically in $q_1, \ldots, q_s$. Consequently,

$$[\mathcal{A} - \alpha]U = 0,$$

$$\mathcal{A}U = \alpha U \tag{5}$$

for arbitrary $q_1, \ldots, q_s$. Accordingly, (5) is a necessary condition
that $\mathfrak{A}$ have a definite value $\alpha$ in the state $U$. This is
also a sufficient condition, as can be easily seen by carrying out all the
calculations in the reverse order.

The relation (5) shows that the action of $\mathcal{A}$ on $U$ leads to the
multiplication of $U$ by a real constant $\alpha$. In the theory of operators,
a function $U$ (not identically zero) related to an operator $\mathcal{A}$ by
an equation of the form (5) is called an eigenfunction of the operator
belonging to the eigenvalue $\alpha$. The numbers which can occur as definite
values of the quantity $\mathfrak{A}$ are the eigenvalues of the operator
$\mathcal{A}$, and the state in which the quantity $\mathfrak{A}$ has a definite
value $\alpha$ is described by one of the eigenfunctions $U$ of the operator
$\mathcal{A}$ belonging to the eigenvalue $\alpha$.
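The connection between zero dispersion and the eigenvalue equation (5) can be seen in a finite-dimensional analogue. The following Python sketch is an illustration only: it assumes two-component complex vectors in place of wave functions and a hypothetical 2 × 2 self-adjoint matrix in place of the operator, and all names are invented for the example.

```python
import math


def matvec(a, u):
    """Apply the matrix a to the vector u."""
    return [sum(a[i][j] * u[j] for j in range(len(u))) for i in range(len(a))]


def inner(u1, u2):
    """Finite-dimensional scalar product: sum of U1 * conj(U2)."""
    return sum(complex(x) * complex(y).conjugate() for x, y in zip(u1, u2))


def expectation(a, u):
    """Real value of (AU, U) / (U, U) for a self-adjoint matrix a."""
    return (inner(matvec(a, u), u) / inner(u, u)).real


def dispersion(a, u):
    """([A - alpha]U, [A - alpha]U) / (U, U), with alpha the expectation in the state U."""
    alpha = expectation(a, u)
    shifted = [x - alpha * y for x, y in zip(matvec(a, u), u)]
    return (inner(shifted, shifted) / inner(u, u)).real


A = [[2.0, 1j], [-1j, 2.0]]                             # self-adjoint matrix
eigen_state = [1j / math.sqrt(2), 1 / math.sqrt(2)]     # satisfies A U = 3 U
other_state = [1.0, 0.0]                                # not an eigenvector of A

if __name__ == "__main__":
    print(expectation(A, eigen_state), dispersion(A, eigen_state))
    print(expectation(A, other_state), dispersion(A, other_state))
```

In the eigenvector the dispersion should vanish up to rounding, while in the other state it is positive.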

If we consider (5) as an equation defining $U$, then in typical
quantum-mechanical cases this will be a partial differential equation, since
the operator $\mathcal{A}$ is formed from the operators $\mathcal{Q}_k$,
$\mathcal{P}_k$ in about the same way as the quantity $\mathfrak{A}$ is formed
from the variables $q_k$, $p_k$ $(k = 1, \ldots, s)$. The desired solutions $U$
of this equation must be uniquely determined in the entire space of the
variables $q_1, \ldots, q_s$, and the integral

$$\int |U|^2 \, dq_1 \cdots dq_s$$

must exist when taken over the entire space. Moreover, these functions
must satisfy certain other general requirements: They must be continuous
and sometimes they must possess partial derivatives to some order. Despite
the seeming broadness of these requirements, in general it turns out that
solutions satisfying them do not exist for all values of the parameter
$\alpha$. Those values of $\alpha$ for which such solutions exist are
eigenvalues of the operator $\mathcal{A}$. The set of eigenvalues of
$\mathcal{A}$ is called the spectrum of $\mathcal{A}$. The spectrum of an
operator can be either discrete (i.e., consisting of single isolated numbers)
or continuous (i.e., comprising a whole segment or even the whole continuum
of real numbers). Cases of combined spectra are also possible, i.e., spectra
which in some places are continuous and in other places discrete. For the
purposes of this book, we need only consider the case in which the (discrete)
spectrum of an operator consists of a sequence of numbers approaching
infinity:

$$\alpha_1 < \alpha_2 < \cdots < \alpha_n < \cdots \quad (\lim_{n \to \infty} \alpha_n = \infty). \tag{6}$$


Suppose that in a certain measurement of the quantity $\mathfrak{A}$ we have
obtained the value $\alpha$. According to quantum mechanics, immediately after
this measurement the system will be in a state such that the quantity
$\mathfrak{A}$ has $\alpha$ as its definite value. Hence a second measurement
of $\mathfrak{A}$ on the same system, provided it follows immediately after
the first, will necessarily have for its result the value $\alpha$. Therefore,
any number $\alpha$ which can ever appear as the result of a measurement of
the quantity $\mathfrak{A}$ (in any state whatever) must also be (for some
particular state) a definite value of this quantity and consequently must
belong to the spectrum of the operator $\mathcal{A}$. This spectrum
accordingly contains all numbers which can occur as results of a measurement
of $\mathfrak{A}$ with the system in an arbitrary state. In the case of the
discrete spectrum (6) which we are considering, the discreteness is nowhere
specially postulated, but appears as a consequence of very general
requirements imposed on the eigenfunctions. This is similar to the theory
of an oscillating string in which the discreteness of the sequence of possible
modes of vibration follows from the boundary conditions imposed. The
determination of all possible values of a physical quantity by constructing
the spectrum of its corresponding operator is commonly called the
“quantization” of this quantity.

We must now consider certain basic properties of the eigenfunctions of a
linear self-adjoint operator $\mathcal{A}$. We shall confine ourselves to the
case in which the spectrum has the form (6) and shall first examine the set
of eigenfunctions belonging to a single eigenvalue $\alpha_r$. Because of the
linear property of the operator $\mathcal{A}$, if $U_{r1}$ and $U_{r2}$ are two
different eigenfunctions belonging to the eigenvalue $\alpha_r$, then
$\lambda_1 U_{r1} + \lambda_2 U_{r2}$, where $\lambda_1$ and $\lambda_2$ are any
complex numbers which are not both zero, will also be an eigenfunction
belonging to the same eigenvalue $\alpha_r$. This means that the set of
eigenfunctions that we are considering forms a linear manifold
$\mathfrak{L}_r$. In the cases to be considered in this book, the manifold
$\mathfrak{L}_r$ always has a finite number of dimensions $m$. This means that
the manifold $\mathfrak{L}_r$ contains $m$ linearly independent functions
$U_{r1}, U_{r2}, \ldots, U_{rm}$, but any $m + 1$ functions of this manifold are
linearly dependent. Consequently, any function $U$ of the manifold
$\mathfrak{L}_r$ can be expressed in a unique way in the form

$$U = \sum_{j=1}^{m} \lambda_j U_{rj},$$

where the $\lambda_j$ are complex numbers. The functions $U_{rj}$
$(j = 1, 2, \ldots, m)$ form a linear basis of the manifold $\mathfrak{L}_r$.
This basis can be chosen in various ways. In particular, well-known processes
of orthogonalization always make possible the construction of a basis
consisting of mutually orthogonal functions $U_{rj}$ [the functions $U_1$ and
$U_2$ are mutually orthogonal if $(U_1, U_2) = 0$].

The basis $U_{rj}$ $(j = 1, 2, \ldots, m)$ is called normalized if
$(U_{rj}, U_{rj}) = 1$ $(1 \leq j \leq m)$, i.e., if all functions forming this
basis are normalized.

Let $V_{r1}, V_{r2}, \ldots, V_{rm}$ be any $m$ eigenfunctions of the operator
$\mathcal{A}$, belonging to the same eigenvalue $\alpha_r$ of this operator as
the functions $U_{rj}$. Then,

$$V_{rg} = \sum_{j=1}^{m} \lambda_{gj} U_{rj} \quad (1 \leq g \leq m), \tag{7}$$

where the $\lambda_{gj}$ are complex numbers. Hence, for $1 \leq g \leq m$,
$1 \leq h \leq m$, we have

$$(V_{rg}, V_{rh}) = \Big( \sum_{i=1}^{m} \lambda_{gi} U_{ri}, \sum_{j=1}^{m} \lambda_{hj} U_{rj} \Big)
= \sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_{gi} \lambda_{hj}^* (U_{ri}, U_{rj}).$$

Since the basis $U_{rj}$ is orthogonal and normalized,

$$(U_{ri}, U_{rj}) = \delta_{ij},$$

where

$$\delta_{ij} = \begin{cases} 1 & (i = j), \\ 0 & (i \neq j). \end{cases}$$

Therefore, we find

$$(V_{rg}, V_{rh}) = \sum_{i=1}^{m} \sum_{j=1}^{m} \lambda_{gi} \lambda_{hj}^* \delta_{ij} = \sum_{j=1}^{m} \lambda_{gj} \lambda_{hj}^* .$$

In order that the functions $V_{rg}$, defined by the relations (7), be
orthogonal and normalized, it is necessary and sufficient that

$$\sum_{j=1}^{m} \lambda_{gj} \lambda_{hj}^* = \begin{cases} 1 & (g = h), \\ 0 & (g \neq h), \end{cases}$$

or, equivalently, that

$$\sum_{j=1}^{m} \lambda_{gj} \lambda_{hj}^* = \delta_{gh} \quad (1 \leq g \leq m,\ 1 \leq h \leq m). \tag{8}$$

The matrix consisting of the complex numbers $\lambda_{gh}$
$(1 \leq g \leq m,\ 1 \leq h \leq m)$ (and also the linear transformation (7)
corresponding to this matrix) is called unitary if the conditions (8) hold
for it. Since the functions $V_{rg}$ are orthogonal and normalized, they are
also linearly independent and consequently form an orthogonal and normalized
basis of the manifold $\mathfrak{L}_r$. [Indeed, if, for example, we had

$$V_{r1} = a_2 V_{r2} + a_3 V_{r3} + \cdots + a_m V_{rm},$$

where the $a_j$ are numbers, then we would find

$$(V_{r1}, V_{r1}) = \sum_{j=2}^{m} a_j (V_{rj}, V_{r1}) = 0,$$

which is impossible.] Thus, if the functions $U_{rj}$ form an orthogonal
normalized basis of the manifold $\mathfrak{L}_r$, then all other such bases
$V_{rg}$ $(1 \leq g \leq m)$ of this manifold are obtained by subjecting the
system of functions $U_{rj}$ $(1 \leq j \leq m)$ to all possible unitary
transformations (7).
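Condition (8) and its effect on a basis can be checked in a small finite-dimensional example. The following Python sketch is an illustration only: it assumes $m = 2$, replaces the eigenfunctions by vectors with three complex components, and uses one hypothetical unitary matrix and one non-unitary matrix. All names are invented for the example.

```python
import math


def inner(u1, u2):
    """Finite-dimensional scalar product: sum of U1 * conj(U2)."""
    return sum(complex(x) * complex(y).conjugate() for x, y in zip(u1, u2))


def transform(lam, basis):
    """V_g = sum over j of lam[g][j] * U_j, the finite-dimensional form of (7)."""
    m, dim = len(basis), len(basis[0])
    return [[sum(lam[g][j] * basis[j][i] for j in range(m)) for i in range(dim)]
            for g in range(m)]


def is_unitary(lam, tol=1e-12):
    """Condition (8): sum over j of lam[g][j] * conj(lam[h][j]) equals delta_gh."""
    m = len(lam)
    return all(abs(inner(lam[g], lam[h]) - (1.0 if g == h else 0.0)) < tol
               for g in range(m) for h in range(m))


def is_orthonormal(vectors, tol=1e-12):
    """True if the vectors are mutually orthogonal and normalized."""
    m = len(vectors)
    return all(abs(inner(vectors[g], vectors[h]) - (1.0 if g == h else 0.0)) < tol
               for g in range(m) for h in range(m))


r = 1 / math.sqrt(2)
basis = [[1.0, 0.0, 0.0], [0.0, r, 1j * r]]      # orthonormal pair U_1, U_2
unitary = [[r, 1j * r], [1j * r, r]]             # satisfies condition (8)
not_unitary = [[1.0, 1.0], [0.0, 1.0]]           # violates condition (8)

if __name__ == "__main__":
    print(is_unitary(unitary), is_orthonormal(transform(unitary, basis)))
    print(is_unitary(not_unitary), is_orthonormal(transform(not_unitary, basis)))
```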

The number of dimensions $m$ of the manifold $\mathfrak{L}_r$ is a very
important property of the eigenvalue $\alpha_r$ and is called its multiplicity
or degree of degeneracy. The eigenvalue $\alpha_r$ is degenerate if $m > 1$
and non-degenerate if $m = 1$. From the physical point of view $m$ gives the
number of linearly independent states of the system in which the quantity
$\mathfrak{A}$ has the definite value $\alpha_r$.

Now let $U_1$ and $U_2$ be two eigenfunctions of the operator $\mathcal{A}$,
belonging to different eigenvalues $\alpha_{r_1}$ and $\alpha_{r_2}$, so that

$$\mathcal{A}U_1 = \alpha_{r_1}U_1, \quad \mathcal{A}U_2 = \alpha_{r_2}U_2. \tag{9}$$

In virtue of the self-adjoint property of the operator $\mathcal{A}$, we have

$$(\mathcal{A}U_1, U_2) = (U_1, \mathcal{A}U_2).$$

Hence, by (9),

$$(\alpha_{r_1}U_1, U_2) = (U_1, \alpha_{r_2}U_2).$$

Since $\alpha_{r_1}$ and $\alpha_{r_2}$ are real, this last equation and
property 2 (§1) of scalar products lead to

$$\alpha_{r_1}(U_1, U_2) = \alpha_{r_2}(U_1, U_2),$$

and the fact that $\alpha_{r_1} \neq \alpha_{r_2}$, to

$$(U_1, U_2) = 0.$$

This shows that two eigenfunctions of the operator $\mathcal{A}$, belonging to
different eigenvalues, are always mutually orthogonal.

If we now choose for each eigenvalue $\alpha_r$ of the operator $\mathcal{A}$
some linear orthogonal basis, then the whole set of eigenfunctions of the
operator $\mathcal{A}$ so obtained obviously forms a (denumerably) infinite
orthogonal system of elements $U$ of a complex Hilbert space. In the simplest
and most important cases, and in particular, for all cases with which we
shall be concerned, this orthogonal system is complete (or closed). This means
that every element $U$ orthogonal to all elements of the system we have
obtained must be identically zero. The completeness of our system of
eigenfunctions is very significant in physics. If we denote by
$U_1, U_2, \ldots, U_n, \ldots$ the elements of a complete orthogonal system,
enumerated in arbitrary order, then any element $U$ of the complex Hilbert
space can be represented in the form of a series

$$\sum_{n=1}^{\infty} c_n U_n$$

which converges to $U$ “in the mean”. [This means that setting
$\sum_{k=1}^{n} c_k U_k = s_n$, we have $(U - s_n, U - s_n) \to 0$ as
$n \to \infty$.] Here the $c_n$ are complex constants. In virtue of the mutual
orthogonality of the functions $U_n$ we easily find

$$c_n = (U, U_n)/(U_n, U_n).$$

If the functions $U_n$ are normalized, then

$$c_n = (U, U_n),$$

and furthermore,

$$(U, U) = \sum_{n=1}^{\infty} c_n (U_n, U) = \sum_{n=1}^{\infty} c_n (U, U_n)^* = \sum_{n=1}^{\infty} |c_n|^2 .$$

In particular, if the function $U$ is also normalized, then

$$\sum_{n=1}^{\infty} |c_n|^2 = 1.$$


In the preceding section we saw that the ability to assign to each physical
quantity a definite linear self-adjoint operator allows us in principle to
determine the statistics (distribution law) of this quantity in any state $U$.
We are now in a position to show concretely how this is done, at least for
quantities with a discrete spectrum. Let the operator $\mathcal{A}$, with the discrete
spectrum (6), correspond to the quantity $\mathfrak{A}$. Since the only possible values
of $\mathfrak{A}$ are the numbers $\alpha_k$, the statistics of $\mathfrak{A}$ in any state $U$
will be fully determined if we can find the probability

    $P(\mathfrak{A} = \alpha_k)$

for each $\alpha_k$. As in §2, we denote by $\mathfrak{B}$ a quantity equal to 1 if
$\mathfrak{A} = \alpha_k$ and equal to zero if $\mathfrak{A} \neq \alpha_k$ (so that $\mathfrak{B}$ is a
function of $\mathfrak{A}$). Let $\mathcal{B}$ be the linear self-adjoint operator corresponding
to the quantity $\mathfrak{B}$. Then the probability that $\mathfrak{A} = \alpha_k$ is the same as
the probability that $\mathfrak{B} = 1$, and is obviously equal to the mathematical
expectation of $\mathfrak{B}$.

Thus, we have (assuming the function $U$ to be normalized)

    $P_U(\mathfrak{A} = \alpha_k) = (\mathcal{B}U, U).$


Now let $U_1, U_2, \ldots, U_n, \ldots$ be the complete orthogonal system of
normalized eigenfunctions of the operator $\mathcal{A}$ and let

    $U = \sum_{n=1}^{\infty} c_n U_n, \qquad c_n = (U, U_n) \quad (n \geq 1).$

Then

    $(\mathcal{B}U, U) = \Bigl(\mathcal{B}\sum_{n=1}^{\infty} c_n U_n,\ \sum_{n=1}^{\infty} c_n U_n\Bigr).$

Each of the functions $U_n$ is an eigenfunction of the operator $\mathcal{A}$, i.e., the
quantity $\mathfrak{A}$ has a definite value in each of the states $U_n$. From this it
follows that the quantity $\mathfrak{B}$, as a function of $\mathfrak{A}$, must also have a
definite value in each of the states $U_n$. This value is equal to 1 in those
states for which $\mathfrak{A} = \alpha_k$, and is equal to zero in the other states. Thus,
each of the functions $U_n$ is an eigenfunction of the operator $\mathcal{B}$; the
corresponding eigenvalue of this operator is 1 if $\mathcal{A}U_n = \alpha_k U_n$, and zero
in all other cases. We can therefore write

    $\mathcal{B}U_n = \beta_n U_n,$

where $\beta_n = 1$ if $\mathcal{A}U_n = \alpha_k U_n$ and $\beta_n = 0$ otherwise. Therefore, we
obtain

    $(\mathcal{B}U, U) = \Bigl(\sum_{n=1}^{\infty} c_n \beta_n U_n,\ \sum_{n=1}^{\infty} c_n U_n\Bigr).$


By using the properties of scalar products, the orthogonality of the system
of functions $U_n$, and the normalized character of these functions, we
now easily find

    $P_U(\mathfrak{A} = \alpha_k) = (\mathcal{B}U, U) = \sum_{n=1}^{\infty} \beta_n |c_n|^2 = {\sum}^{(k)} |c_n|^2 = {\sum}^{(k)} |(U, U_n)|^2,$

where ${\sum}^{(k)}$ means that the summation is taken over all values $n$ for which
the eigenfunction $U_n$ belongs to the eigenvalue $\alpha_k$. Applying the theorem
on the composition of probabilities to this expression, we find

    $P_U(a < \mathfrak{A} < b) = \sum_{(a,b)} |c_n|^2 = \sum_{(a,b)} |(U, U_n)|^2,$

where the summation is taken over all eigenfunctions $U_n$ of the operator $\mathcal{A}$
belonging to eigenvalues included between $a$ and $b$. In this way we obtain a
definite rule for finding the distribution law of the quantity $\mathfrak{A}$ in the state
$U$: Expand the function $U$ in terms of the complete orthogonal system of
normalized eigenfunctions $U_n$ of the operator $\mathcal{A}$. The probability that
$a < \mathfrak{A} < b$ is the sum of the squares of the absolute values of the
coefficients of this expansion for all those functions $U_n$ that belong to
eigenvalues included between $a$ and $b$.
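The rule just stated can be sketched in a finite-dimensional setting. In the example below, the three orthonormal eigenvectors, their eigenvalues (with the value 1 taken doubly degenerate), the state `U`, and the function names `distribution` and `probability_between` are all assumptions introduced for illustration; only the rule itself (sum the squared moduli of the expansion coefficients over the relevant eigenfunctions) comes from the text.

```python
import math

def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

# Assumed orthonormal eigenvectors with their eigenvalues:
# alpha = 1.0 is doubly degenerate, alpha = 2.0 is non-degenerate.
s = 1 / math.sqrt(2)
basis = [
    ([s, s, 0j], 1.0),
    ([s, -s, 0j], 1.0),
    ([0j, 0j, 1 + 0j], 2.0),
]

# An assumed normalized state: |0.6|^2 + |0.8i|^2 = 1.
U = [0.6, 0j, 0.8j]

def distribution(state, eigenbasis):
    """P(A = alpha_k): sum of |c_n|^2 over eigenvectors belonging to alpha_k."""
    probs = {}
    for vector, alpha in eigenbasis:
        c_n = inner(state, vector)          # c_n = (U, U_n)
        probs[alpha] = probs.get(alpha, 0.0) + abs(c_n) ** 2
    return probs

def probability_between(state, eigenbasis, a, b):
    """P(a < A < b): sum of |c_n|^2 over eigenvalues strictly between a and b."""
    return sum(abs(inner(state, v)) ** 2
               for v, alpha in eigenbasis if a < alpha < b)

P = distribution(U, basis)
print(P, probability_between(U, basis, 1.5, 2.5))
```

The two coefficients belonging to the degenerate eigenvalue are added together, which is exactly the role of the restricted sum in the formula above.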

We now apply the complete orthogonal and normalized system of
eigenfunctions of the operator $\mathcal{A}$ to prove one more elementary proposition
which we shall require later: If the operator $\mathcal{A}$ has the spectrum (6), then
the spectrum of the operator $\mathcal{A}^2$ consists of the numbers

    $\alpha_1^2, \alpha_2^2, \ldots, \alpha_k^2, \ldots,$

and the manifold of eigenfunctions of the operator $\mathcal{A}^2$ belonging to the
eigenvalue $\alpha_k^2$ coincides with the manifold of eigenfunctions of the operator
$\mathcal{A}$ belonging to the eigenvalue $\alpha_k$.

Indeed, if $U$ is an eigenfunction of the operator $\mathcal{A}$ belonging to the
eigenvalue $\alpha_k$, then

    $\mathcal{A}^2 U = \mathcal{A}(\mathcal{A}U) = \mathcal{A}(\alpha_k U) = \alpha_k \mathcal{A}U = \alpha_k^2 U,$

i.e., $U$ is an eigenfunction of the operator $\mathcal{A}^2$ belonging to the eigenvalue
$\alpha_k^2$.

Now suppose that, conversely, $U$ is an eigenfunction of the operator $\mathcal{A}^2$
belonging to some eigenvalue $\beta$ of this operator.
We expand the function $U$ in terms of the complete orthogonal system
of eigenfunctions of the operator $\mathcal{A}$ (which, as we have just ascertained, are
at the same time eigenfunctions of the operator $\mathcal{A}^2$):

(10)    $U = \sum_{k=1}^{\infty} c_k U_k,$

where

    $c_k = (U, U_k) \qquad (k = 1, 2, \ldots).$

Since $U$ and $U_k$ are eigenfunctions of the same operator $\mathcal{A}^2$, $c_k \neq 0$ only
for $\beta = \alpha_k^2$. This shows that an eigenvalue $\beta$ of the operator $\mathcal{A}^2$ must
necessarily coincide with one of the numbers $\alpha_k^2$, i.e., that the spectrum of
the operator $\mathcal{A}^2$ actually gives the set of numbers $\alpha_k^2$. Now let
$\beta = \alpha_k^2$ and let $U_{k1}, U_{k2}, \ldots, U_{km}$ be the eigenfunctions of the operator
$\mathcal{A}$ from the orthogonal system we have chosen, belonging to the eigenvalue
$\alpha_k$. Then the only non-vanishing coefficients in (10) are those of the
functions $U_{ki}$ $(1 \leq i \leq m)$. Hence,

    $U = \sum_{i=1}^{m} c_{ki} U_{ki},$

and consequently,

    $\mathcal{A}U = \alpha_k U,$

i.e., $U$ is an eigenfunction of the operator $\mathcal{A}$, and our proposition is proved.


§4. Evolution of the state of a system in time 


In classical mechanics the state of a system is described by the values of
all $2s$ Hamiltonian variables $q_1, \ldots, q_s, p_1, \ldots, p_s$. In order to know
how this state changes with time, it is necessary to give all $2s$ of these
variables as functions of time. The fact that the values of the Hamiltonian
variables at any initial time $t_0$ uniquely determine their values at any other
(preceding or following) time $t$ is an expression of the principle of causality
in classical mechanics.

In quantum mechanics the state of a system is described by a certain
function $U(q_1, q_2, \ldots, q_s)$ of the generalized coordinates. The change of
the state with time will be known if the dependence of the function $U$ on
the time is given, i.e., if $U$ is given as a function of the $s + 1$ variables
$q_1, q_2, \ldots, q_s, t$:

    $U = U(q_1, q_2, \ldots, q_s, t).$

In quantum mechanics an expression of the principle of causality is the
requirement that the form of the function $U$ at any initial time $t_0$ uniquely
determine its form at any other (preceding or following) time $t$. Since, in
quantum mechanics, the function $U$ determines the statistics (distribution
law) of physical quantities and not their values, this requirement has the
physical meaning that the statistics of physical quantities, at the initial
time $t_0$, uniquely determine their statistics at any other time $t$.

In classical mechanics the law of motion (i.e., the expression of the
Hamiltonian variables as functions of time) is found by solving the “equations of
motion”. In the case of Hamiltonian variables these equations assume the
well-known “canonical” form, which is remarkably simple and symmetric.
The fundamental role in these equations is played by the so-called
Hamiltonian function

    $H(q_1, \ldots, q_s, p_1, \ldots, p_s).$

When the forces acting on the system are independent of time, the
expression for the Hamiltonian function is the same as the expression for the total
energy of the system

    $H = T + V,$

where $T$ is the kinetic energy, and $V$ the potential energy of the system.

In quantum mechanics the Hamiltonian function is assigned a linear
self-adjoint operator $\mathcal{H}$, the operator of total energy, also called the
Hamiltonian. This operator is just as important for the determination of the
evolution of the state of a system in time as the Hamiltonian function is in
classical mechanics.

In the simplest cases the expression for the operator $\mathcal{H}$ can be chosen
correctly by direct analogy with the expression for the Hamiltonian
function in classical mechanics. If our system is an elementary particle, then the
simplest procedure is to choose as the Hamiltonian variables its three
Cartesian coordinates $x, y, z$ and the corresponding components of momentum
$p_x, p_y, p_z$. In this case the kinetic energy will be

    $T = (p_x^2 + p_y^2 + p_z^2)/2m,$

where $m$ is the mass of the particle, and the potential energy $V(x, y, z)$
will depend only on the coordinates of the particle. Hence the expression
for the Hamiltonian function is

    $H = (1/2m)(p_x^2 + p_y^2 + p_z^2) + V(x, y, z).$

To construct the operator $\mathcal{H}$, we recall that the operators corresponding
to the quantities $p_x, p_y, p_z$ are $-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$,
$-i\hbar\,\partial/\partial z$. According to the algebra of operators adopted in §2, the
operators corresponding to the squares of these quantities must be

    $-\hbar^2\,\partial^2/\partial x^2, \quad -\hbar^2\,\partial^2/\partial y^2, \quad -\hbar^2\,\partial^2/\partial z^2.$

Thus, the operator corresponding to the kinetic energy $T$ will be

    $-(\hbar^2/2m)(\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2) = -(\hbar^2/2m)\nabla^2,$

where $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ is the so-called Laplacean
operator. Since the operator corresponding to the potential energy $V(x, y, z)$
is simply the operator $\mathcal{V}$ of multiplication by $V(x, y, z)$, we naturally take
as the Hamiltonian operator

(11)    $\mathcal{H} = -(\hbar^2/2m)\nabla^2 + \mathcal{V}.$
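To make the construction (11) concrete, here is a minimal numerical sketch restricted to one dimension, so that the Laplacean reduces to a second derivative. The units ($\hbar = 1$, $m = 1$), the finite-difference step `h`, the trial function $\sin kx$, and all function names are assumptions of this example, not part of the text. For a free particle ($V = 0$) the function $\sin kx$ should be reproduced by the operator up to the factor $\hbar^2 k^2 / 2m$.

```python
import math

HBAR = 1.0   # assumed units in which hbar = 1
MASS = 1.0   # assumed particle mass

def kinetic(f, x, h=1e-3):
    """Central-difference approximation of -(hbar^2 / 2m) f''(x)."""
    second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2
    return -(HBAR ** 2) / (2.0 * MASS) * second_derivative

def hamiltonian(f, potential, x, h=1e-3):
    """One-dimensional analogue of (11): kinetic part plus multiplication by V."""
    return kinetic(f, x, h) + potential(x) * f(x)

K = 2.0  # assumed wave number of the trial function

def wave(x):
    return math.sin(K * x)

def free(x):
    return 0.0  # free particle: V = 0

x0 = 0.3
computed = hamiltonian(wave, free, x0)
expected = (HBAR * K) ** 2 / (2.0 * MASS) * wave(x0)
print(computed, expected)
```

The two printed numbers agree up to the discretization error of the difference quotient, which is of order `h**2`.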


The operator $\mathcal{H}$ is constructed just as simply in the somewhat more
complicated case of a system consisting of several such structureless physical
particles, if it is assumed that interaction forces between the particles are
absent. In this case the total energy of the system is simply the sum of the
total energies of the particles composing the system. We naturally assume
that the Hamiltonian of this system is equal to the sum of the Hamiltonians
of the particles composing the system, where the latter are constructed
according to equation (11).

Such an immediately obvious determination of the operator $\mathcal{H}$ is possible
only in the simplest cases. Even if the system consists of a single
structureless particle, the procedure is this simple only when rectangular coordinates
are chosen as the Hamiltonian variables. In other coordinate systems the
kinetic energy is a quadratic form in the generalized momenta, in which
the coefficients depend on the generalized coordinates. In this case the
proper choice for the corresponding operator is not so obvious.

We now turn to a discussion of the laws of quantum mechanics which
determine the change of state of a system with time. For brevity, we denote
the function $U(q_1, \ldots, q_s, t)$, which describes the state of a system at
time $t$, by $U(q, t)$. If, as we require, the specification of the function $U(q, t_0)$
uniquely determines the function $U(q, t)$ for all $t$, then in particular the
function $\partial U/\partial t$ is uniquely determined at $t = t_0$. (We assume, of course,
that this derivative exists.) Since $t_0$ was chosen arbitrarily, we can therefore
say that for any $t$ some function $\partial U/\partial t$ uniquely corresponds to each
function $U$. We have agreed to call such a correspondence an operator. Thus
the function $\partial U/\partial t$ is the result of the action of a certain operator on the
function $U$. In general, this operator can be different for different times, so
that it will be convenient to denote it by $\mathcal{A}_t$. Hence,

(12)    $\partial U(q, t)/\partial t = \mathcal{A}_t U(q, t).$


We still have to determine the operator $\mathcal{A}_t$. By a heuristic method, which
involves the analysis of a number of simple special examples and the
generalization of the results so obtained (and includes the application of a
number of theoretical considerations), it has been found possible to
establish the universal form of this operator. It turns out that one must set

    $\mathcal{A}_t = -i\mathcal{H}/\hbar,$

where $\hbar$ is Planck’s constant divided by $2\pi$ and $\mathcal{H}$ is the Hamiltonian of the
system. This rule is always valid. In particular, it remains true when
variable forces act on the system, i.e., when the operator $\mathcal{H}$ depends on the
time and is, in general, no longer the operator of total energy. However, in
all that follows we shall suppose that the operator $\mathcal{H}$ is independent of the
time, and, consequently, that it corresponds to the physical quantity which
we call the total energy of the system. Equation (12) then takes the form

    $i\hbar\,\partial U(q, t)/\partial t = \mathcal{H}U(q, t),$

or, more briefly,

(13)    $i\hbar\,\partial U/\partial t = \mathcal{H}U.$

This is the most important equation in quantum mechanics, and is
analogous to the system of equations of motion of classical mechanics. It
is commonly called the Schrödinger equation. It defines the function
$U(q_1, \ldots, q_s, t)$ as the solution of a certain partial differential equation.
We shall see that this is always an equation of first order in $t$, but of the
second order in the generalized coordinates $q_k$, since the kinetic energy
operator which forms part of the operator $\mathcal{H}$ contains the operators
$\partial^2/\partial q_k^2$.

As our first application of Schrödinger’s equation we prove an important
auxiliary theorem which will be needed later. We call the quantity

    $(U, U) = \int |U|^2\, dq_1 \cdots dq_s$

the norm of the function $U$, so that in particular, for normalized functions $U$,
the norm is equal to unity. We now show that the norm of any solution to
Schrödinger’s equation remains constant in time (i.e., the functional
$(U, U)$ is an integral of the Schrödinger equation). In particular, if the
function $U$ which describes the state of a system is normalized [$(U, U) = 1$]
at $t_0$, then it will also be normalized at any other (preceding or following)
time $t$.

To prove this, we start with equation (13), where $U = U(q_1, q_2, \ldots,
q_s, t)$. Forming the complex conjugate quantities yields

(14)    $-i\hbar\,\partial U^*/\partial t = (\mathcal{H}U)^*.$

We multiply equations (13) and (14) by $U^*$ and $U$, respectively, and
subtract the second equation from the first. We find

    $i\hbar[U^*(\partial U/\partial t) + U(\partial U^*/\partial t)] = i\hbar\,\partial |U|^2/\partial t = (\mathcal{H}U)U^* - U(\mathcal{H}U)^*.$

Integrating this equation over the entire configuration space, we obtain

    $i\hbar[d(U, U)/dt] = (\mathcal{H}U, U) - (U, \mathcal{H}U).$

The right side of this equation is equal to zero, because of the self-adjoint
character of the operator $\mathcal{H}$. Therefore,

    $d(U, U)/dt = 0,$

which was to be proved.
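The conservation of the norm can be observed numerically on a toy model. The sketch below integrates $i\hbar\,dU/dt = \mathcal{H}U$ for a two-component state with a classical fourth-order Runge-Kutta scheme; the 2x2 Hermitian matrix `H`, the initial state, the step size, and the choice of integrator are all assumptions of this example. The integrator is only approximately norm-preserving, so the check is made up to a small tolerance.

```python
HBAR = 1.0  # assumed units in which hbar = 1

# Assumed self-adjoint (here real symmetric) 2x2 Hamiltonian matrix.
H = [[1.0 + 0j, 0.5 + 0j],
     [0.5 + 0j, -1.0 + 0j]]

def rhs(u):
    """Right side of dU/dt = -(i/hbar) H U."""
    return [-1j / HBAR * sum(H[r][c] * u[c] for c in range(2)) for r in range(2)]

def rk4_step(u, dt):
    """One classical Runge-Kutta step for the linear system above."""
    k1 = rhs(u)
    k2 = rhs([u[i] + 0.5 * dt * k1[i] for i in range(2)])
    k3 = rhs([u[i] + 0.5 * dt * k2[i] for i in range(2)])
    k4 = rhs([u[i] + dt * k3[i] for i in range(2)])
    return [u[i] + dt / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(2)]

def norm(u):
    """(U, U) = sum of |u_j|^2, the finite-dimensional analogue of the integral."""
    return sum(abs(x) ** 2 for x in u)

u = [0.6 + 0j, 0.8j]          # normalized initial state
norms = [norm(u)]
for _ in range(2000):          # integrate from t = 0 to t = 2
    u = rk4_step(u, 1e-3)
    norms.append(norm(u))

drift = max(abs(n - 1.0) for n in norms)
print(drift)
```

Replacing `H` by a non-Hermitian matrix in this sketch makes the norm drift visibly, which illustrates where the self-adjointness enters the proof.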


§5. Stationary states. The law of conservation of energy 


In the cases of interest to us, the spectrum of the total energy operator $\mathcal{H}$
of the system will always consist of a sequence of numbers of the form (6)
of §3. Therefore, we can select a linear orthogonal basis

    $U_1, U_2, \ldots, U_n, \ldots$

of normalized eigenfunctions of the operator $\mathcal{H}$. Let the function $U_n$ belong
to the eigenvalue $E_n$ of the operator $\mathcal{H}$, so that

(15)    $\mathcal{H}U_n = E_n U_n \qquad (n = 1, 2, \ldots).$

(In general, because of the possibility of degeneracy, the numbers $E_n$ may
contain sets of successive numbers that are equal to each other.)

A solution $U(q, t)$ of Schrödinger’s equation can then be expanded in
terms of the functions $U_n$. The coefficients of this expansion will, of course,
vary with $t$, so that we can write

    $U(q, t) = \sum_{n=1}^{\infty} a_n(t) U_n(q),$

where $a_n(t)$ are complex functions of $t$. Substituting this expression into
Schrödinger’s equation (13) and assuming, as we have agreed, that the
operator $\mathcal{H}$ is independent of time, we find in virtue of equation (15)

    $i\hbar \sum_{n=1}^{\infty} (da_n/dt) U_n(q) = \sum_{n=1}^{\infty} a_n(t) \mathcal{H}U_n(q) = \sum_{n=1}^{\infty} E_n a_n(t) U_n(q).$

Because of the uniqueness of expansions in terms of the functions $U_n(q)$
we can equate the coefficients of $U_n(q)$ on the left and right sides:

    $i\hbar\, da_n/dt = E_n a_n(t) \qquad (n = 1, 2, \ldots).$

From this we readily find

    $a_n(t) = c_n e^{-(i/\hbar)E_n t}.$

Consequently,

(16)    $U(q, t) = \sum_{n=1}^{\infty} c_n e^{-(i/\hbar)E_n t} U_n(q),$

where the $c_n$ are complex constants. Since we have taken the functions
$U_n(q)$ to be normalized, it follows that

    $(U, U) = \sum_{n=1}^{\infty} |c_n|^2.$

Accordingly, if we wish the function $U(q, t)$ to be normalized, we must
have

(17)    $\sum_{n=1}^{\infty} |c_n|^2 = 1.$

Thus, the general normalized solution of Schrödinger’s equation can be
represented in the form (16), where the complex constants $c_n$ must satisfy
(17) for the solution to be normalized. Conversely, if the $c_n$ are arbitrary
complex numbers satisfying (17), then (16) gives us a normalized solution
of Schrödinger’s equation.
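The coefficient formula leading to (16) and the normalization condition (17) can be sketched numerically for a finite set of levels. The energies, the constants `c`, the units ($\hbar = 1$), and the function names below are assumptions made for illustration; the sketch checks that $a_n(t) = c_n e^{-(i/\hbar)E_n t}$ satisfies $i\hbar\,da_n/dt = E_n a_n$ (with the derivative approximated by a central difference) and that $\sum |a_n(t)|^2$ does not depend on $t$.

```python
import cmath

HBAR = 1.0                                  # assumed units in which hbar = 1
energies = [0.5, 1.5, 1.5, 4.0]             # assumed levels E_n (one degenerate pair)
c = [0.5, 0.5j, -0.5, 0.5]                  # assumed constants with sum |c_n|^2 = 1

def coefficients(t):
    """a_n(t) = c_n * exp(-i E_n t / hbar)."""
    return [cn * cmath.exp(-1j * En * t / HBAR) for cn, En in zip(c, energies)]

def norm_squared(t):
    """(U, U) = sum |a_n(t)|^2 for an orthonormal system U_n."""
    return sum(abs(a) ** 2 for a in coefficients(t))

def residual(n, t, dt=1e-6):
    """i*hbar*da_n/dt - E_n*a_n, with da_n/dt taken by central difference."""
    da = (coefficients(t + dt)[n] - coefficients(t - dt)[n]) / (2 * dt)
    return 1j * HBAR * da - energies[n] * coefficients(t)[n]

print(norm_squared(0.0), norm_squared(3.7), abs(residual(3, 1.2)))
```

Each coefficient only rotates in the complex plane, so its modulus, and with it the sum in (17), stays fixed.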

In particular, the functions

(18)    $e^{-(i/\hbar)E_n t} U_n(q) \qquad (n = 1, 2, \ldots)$

are normalized solutions of Schrödinger’s equation. The states of the
system described by these functions are commonly called stationary states.
This name is justified by the fact that if the state of a system is described
by a function of the form (18), then the states of this system at the times
$t_1$ and $t_2$ are described by functions that differ from each other only by a
constant factor, and, therefore, the statistics of any physical quantity will
be precisely the same at these two times. In other words, in a stationary
state the statistics of physical quantities are independent of time. Since
these statistics are all that we can obtain from a knowledge of the state
of a system in quantum mechanics, we must conclude that when the
evolution of a system is described by a function of the form (18), the physical
state of this system remains unchanged in the course of time.
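As a worked check of this time-independence, consider the mathematical expectation of an arbitrary quantity with operator $\mathcal{A}$ in a state of the form (18). The sketch assumes only that $E_n$ is real (as an eigenvalue of a self-adjoint operator) and that the scalar product is linear in its first argument and conjugate-linear in its second; the symbol $V_k$ used afterwards for the eigenfunctions of $\mathcal{A}$ is notation introduced here for illustration.

```latex
\begin{aligned}
\bigl(\mathcal{A}\,e^{-(i/\hbar)E_n t}U_n,\; e^{-(i/\hbar)E_n t}U_n\bigr)
  &= e^{-(i/\hbar)E_n t}\,\bigl(e^{-(i/\hbar)E_n t}\bigr)^{*}\,(\mathcal{A}U_n,\,U_n) \\
  &= \bigl|e^{-(i/\hbar)E_n t}\bigr|^{2}\,(\mathcal{A}U_n,\,U_n)
   = (\mathcal{A}U_n,\,U_n).
\end{aligned}
```

The same cancellation applies to each squared coefficient $|(e^{-(i/\hbar)E_n t}U_n, V_k)|^2 = |(U_n, V_k)|^2$ in an expansion over eigenfunctions $V_k$ of $\mathcal{A}$, so the whole distribution law, and not only the expectation, is unchanged.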

Functions of the form (18) are eigenfunctions of the total energy
operator $\mathcal{H}$ and, consequently, describe states of the system in which its total
energy has a definite value. In this way we arrive at the important
conclusion that, if the total energy of a system at a certain instant of time has a
definite value (for example, if at that instant a measurement of the total
energy has just been made), then the physical state of the system in its
further evolution will remain unchanged. This conclusion is particularly
interesting since there is no analogous result in classical mechanics. Indeed,
in classical mechanics the total energy always has a definite value and does
not change with time (if the external forces remain constant). The
Hamiltonian variables, however, which evolve according to the equations of
motion, constantly change their values; and since the state of the system
in classical mechanics is defined by a set of these values (or, equivalently,
by a point in its phase space), this state is subject to continuous change.
The various “ergodic” theorems or hypotheses even try to establish that
this change has an extremely broad character, i.e., that the state of the
system in the course of time comes arbitrarily close to any state consistent
with the given value of the total energy. We now see that in quantum
mechanics similar ergodic postulates or theorems are impossible, at least in
states in which the total energy has a definite value. However, statistical
thermodynamics is primarily concerned with just such states.

We must now determine the physical quantities in quantum mechanics 
which are integrals of the motion. In classical mechanics a function of the 
Hamiltonian variables whose value, in virtue of the equations of motion, 
remains constant in time is called an integral of the motion. In quantum
mechanics Schrödinger’s equation serves as the equation of motion. The
function U(q, t) determined by this equation does not permit us to find the
values of physical quantities, but only their distribution laws (i.e., their
statistics). Therefore, Schrödinger’s equation can have as a consequence
the constancy in time of at most the distribution law of a physical quan- 
tity, but not of its value which in general is not determined. Hence in 
quantum mechanics we must call any physical quantity, whose distribution 
law is time independent, an integral of the motion. 

The mathematical apparatus of quantum mechanics yields a
characterization for integrals of the motion which is remarkably simple. It is
expressed by the following proposition:

THEOREM. In order for a physical quantity $\mathfrak{A}$ to be an integral of the motion,
it is necessary and sufficient that the corresponding operator $\mathcal{A}$ commute with
the total energy operator $\mathcal{H}$ of the system, i.e., that $\mathcal{A}\mathcal{H} = \mathcal{H}\mathcal{A}$.

We prove only the sufficiency of this criterion, since we shall not require
its necessity. For this purpose, we first prove the following very general
auxiliary proposition:

LEMMA. Let $\mathcal{A}$ and $\mathcal{B}$ be two linear self-adjoint operators with discrete
spectra, and let $\mathcal{A}\mathcal{B} = \mathcal{B}\mathcal{A}$. Then there exists a complete orthogonal system of
eigenfunctions of the operator $\mathcal{A}$, such that all of the functions of this system
are simultaneously eigenfunctions of the operator $\mathcal{B}$.

Proof of the Lemma. We begin with an arbitrary complete orthogonal
system of eigenfunctions of the operator $\mathcal{A}$. Let $\alpha$ be any eigenvalue of the
operator $\mathcal{A}$ and let $U$ be any eigenfunction of the operator belonging to this
eigenvalue, so that

    $\mathcal{A}U = \alpha U.$

If $\mathcal{A}\mathcal{B} = \mathcal{B}\mathcal{A}$, then

    $\mathcal{A}\mathcal{B}U = \mathcal{B}\mathcal{A}U = \mathcal{B}(\alpha U) = \alpha\mathcal{B}U,$

i.e., the function $\mathcal{B}U$ is also an eigenfunction of the operator $\mathcal{A}$ belonging to
the same eigenvalue $\alpha$.

Now suppose the eigenvalue $\alpha$ of the operator $\mathcal{A}$ has multiplicity (degree
of degeneracy) $m$, and let the eigenfunctions belonging to this eigenvalue
in our chosen complete orthogonal system be $U_1, U_2, \ldots, U_m$. We have
just shown that the functions $\mathcal{B}U_k$ $(k = 1, 2, \ldots, m)$ are also
eigenfunctions of the operator $\mathcal{A}$ belonging to the eigenvalue $\alpha$. Consequently, we
must have

(19)    $\mathcal{B}U_k = \sum_{i=1}^{m} a_{ki} U_i \qquad (k = 1, 2, \ldots, m),$

where the $a_{ki}$ are complex numbers.

Now let

    $V = \sum_{r=1}^{m} c_r U_r$

be a linear combination of the functions $U_r$ with arbitrary complex
coefficients $c_r$, so that $V$ is also an eigenfunction of the operator $\mathcal{A}$ belonging to
the eigenvalue $\alpha$. We shall try to choose these coefficients so that the
function $V$ will also be an eigenfunction of the operator $\mathcal{B}$. For this to be true,
we must have

(20)    $\mathcal{B}\sum_{r=1}^{m} c_r U_r = \beta \sum_{r=1}^{m} c_r U_r,$

where $\beta$ is a complex number. But because of (19) and the linearity of the
operator $\mathcal{B}$,

    $\mathcal{B}\sum_{r=1}^{m} c_r U_r = \sum_{r=1}^{m} c_r \mathcal{B}U_r = \sum_{r=1}^{m} c_r \sum_{i=1}^{m} a_{ri} U_i,$

or, by changing the designations of the summation indices,

    $\mathcal{B}\sum_{r=1}^{m} c_r U_r = \sum_{i=1}^{m} c_i \sum_{r=1}^{m} a_{ir} U_r = \sum_{r=1}^{m} \Bigl(\sum_{i=1}^{m} c_i a_{ir}\Bigr) U_r.$

Therefore, (20) gives

    $\sum_{r=1}^{m} \Bigl[\sum_{i=1}^{m} c_i a_{ir} - \beta c_r\Bigr] U_r = 0,$

from which, by using the linear independence of the functions $U_1, \ldots, U_m$,
we have

    $\sum_{i=1}^{m} c_i a_{ik} - \beta c_k = 0 \qquad (k = 1, 2, \ldots, m).$


Writing as before,

    $\delta_{ik} = 1 \ (i = k), \qquad \delta_{ik} = 0 \ (i \neq k),$

we can rewrite the above system of equations in the form

    $\sum_{i=1}^{m} (a_{ik} - \beta\delta_{ik}) c_i = 0 \qquad (k = 1, 2, \ldots, m).$

Accordingly, we obtain a system of homogeneous linear equations for
the determination of the coefficients $c_i$. In order that this system have a
non-trivial solution, it is necessary that its determinant be zero. This,
obviously, gives an equation of degree $m$ in the parameter $\beta$. Let $\beta_1, \beta_2, \ldots,
\beta_m$ be the roots of this equation. To the root $\beta_k$ there then corresponds a
system of coefficients $c_{ki}$ $(1 \leq i \leq m)$, for which

    $V_k = \sum_{i=1}^{m} c_{ki} U_i \qquad (k = 1, 2, \ldots, m)$

is an eigenfunction of the operator $\mathcal{B}$, belonging to the eigenvalue $\beta_k$. (For
the sake of simplicity we shall not consider the case in which the numbers
$\beta_k$ are not all distinct. All of our conclusions remain valid in this case and
merely require a little more discussion.) At the same time $V_k$ is also an
eigenfunction of the operator $\mathcal{A}$, belonging to the eigenvalue $\alpha$. The system of
functions $V_1, V_2, \ldots, V_m$ can be used to replace our originally chosen
system $U_1, U_2, \ldots, U_m$ as the linear basis of the manifold of
eigenfunctions of the operator $\mathcal{A}$, belonging to the eigenvalue $\alpha$ (the orthogonality
of this basis follows from the fact that the $V_k$ are eigenfunctions of the
operator $\mathcal{B}$, belonging to different eigenvalues $\beta_k$ of $\mathcal{B}$). After making such
a replacement for all eigenvalues $\alpha$ of the operator $\mathcal{A}$, we obtain, by means
of the functions $V_k$, a complete orthogonal system of eigenfunctions of the
operator $\mathcal{A}$, all of whose members are also eigenfunctions of the operator
$\mathcal{B}$. This verifies the lemma stated above.
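The construction in this proof can be followed step by step on the smallest non-trivial case, $m = 2$. In the sketch below the matrices `A` and `B`, the standard basis playing the role of $U_1, U_2$, and all helper names are assumptions chosen for illustration; the example is restricted to real entries and to the case where the off-diagonal coefficient `a[1][0]` is non-zero, so that the closed-form solution of the 2x2 homogeneous system used in `coefficients` applies.

```python
import math

def apply(matrix, u):
    """Action of a matrix operator on a vector."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in matrix]

# Assumed example: A has the doubly degenerate eigenvalue alpha = 2,
# and B commutes with A (every matrix commutes with a multiple of the identity).
A = [[2.0, 0.0], [0.0, 2.0]]
B = [[0.0, 1.0], [1.0, 0.0]]

# Take U_1 = (1, 0), U_2 = (0, 1).  Writing B U_k = sum_i a[k][i] U_i as in (19),
# the coefficients are a[k][i] = B[i][k].
a = [[B[i][k] for i in range(2)] for k in range(2)]

# Determinant condition det(a - beta * I) = 0: a quadratic equation in beta.
trace = a[0][0] + a[1][1]
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
betas = [(trace + disc) / 2, (trace - disc) / 2]

def coefficients(beta):
    """Normalized non-trivial solution c of sum_i c_i (a_ik - beta delta_ik) = 0.

    For m = 2, c = (a[1][0], beta - a[0][0]) solves both equations whenever
    the determinant vanishes; this sketch assumes a[1][0] != 0.
    """
    c = [a[1][0], beta - a[0][0]]
    length = math.sqrt(sum(x * x for x in c))
    return [x / length for x in c]

# V_k = sum_i c_ki U_i; with the standard basis, V_k is just the coefficient list.
V = [coefficients(beta) for beta in betas]
print(betas, V)
```

Here the two roots are distinct, so the resulting functions $V_1, V_2$ are automatically orthogonal, as the proof asserts.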

Proof of the Theorem. Now let $U_1, U_2, \ldots$ be a complete orthogonal
system of functions all of whose members are simultaneously eigenfunctions
of the operators $\mathcal{A}$ and $\mathcal{H}$. (This system exists in virtue of the lemma
proved above.) Expanding the solution $U(q, t)$ of Schrödinger’s equation
in terms of this orthogonal system of functions we obtain

    $U(q, t) = \sum_{k=1}^{\infty} a_k(t) U_k(q),$

where $a_k(t)$ are complex numbers which may depend on $t$. Since the $U_k(q)$
are eigenfunctions of the operator $\mathcal{A}$, the general result of §3 shows that
the probability that the quantity $\mathfrak{A}$ assume a particular one of its possible
values is expressed uniquely by means of the quantities $|a_k(t)|^2$. On the
other hand, the functions $U_k(q)$ are also simultaneously eigenfunctions of
the operator $\mathcal{H}$, so that, as we saw at the beginning of the present section,

    $a_k(t) = c_k e^{-(i/\hbar)E_k t},$

where the $c_k$ are complex constants.

From the above it follows that $|a_k(t)|^2 = |c_k|^2$ and, consequently, that
the numbers $|a_k(t)|^2$, and also the probabilities mentioned above, do not
change with time. This obviously means that the quantity $\mathfrak{A}$ is an integral
of the motion, and the proof of the theorem is thus complete.
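The role of commutation can be illustrated on a two-level model. In the sketch below the energies, the initial coefficients, the units ($\hbar = 1$) and both observables are assumptions made for this example: the first observable is diagonal in the energy basis and therefore commutes with the Hamiltonian, while the second, the matrix with rows (0, 1) and (1, 0), does not. Only the first has a time-independent distribution.

```python
import cmath
import math

HBAR = 1.0                      # assumed units in which hbar = 1
E = [1.0, -1.0]                 # assumed energies; the Hamiltonian is diagonal here
c = [1 / math.sqrt(2)] * 2      # assumed initial coefficients, sum |c_k|^2 = 1

def state(t):
    """Components a_k(t) = c_k exp(-i E_k t / hbar) in the energy eigenbasis."""
    return [ck * cmath.exp(-1j * Ek * t / HBAR) for ck, Ek in zip(c, E)]

def inner(u, v):
    """Scalar product (u, v): linear in u, conjugate-linear in v."""
    return sum(x * y.conjugate() for x, y in zip(u, v))

# Eigenvectors of an observable that commutes with the Hamiltonian
# (diagonal in the same basis).
commuting_basis = [[1.0, 0.0], [0.0, 1.0]]

# Eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2) of the non-commuting
# observable [[0, 1], [1, 0]].
s = 1 / math.sqrt(2)
noncommuting_basis = [[s, s], [s, -s]]

def distribution(t, basis):
    """Probabilities |(U(t), V_k)|^2 for the eigenvectors V_k in `basis`."""
    return [abs(inner(state(t), v)) ** 2 for v in basis]

print(distribution(0.0, commuting_basis), distribution(1.3, commuting_basis))
print(distribution(0.0, noncommuting_basis), distribution(1.3, noncommuting_basis))
```

For the non-commuting observable the first probability works out to $\cos^2(t)$ in this model, so it oscillates, whereas the commuting observable keeps the probabilities 1/2 and 1/2 at all times.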

Since the operator $\mathcal{H}$ commutes with itself, it follows as a special result
of the above theorem that for conservative systems (i.e., systems in which
the total energy does not depend explicitly on the time) the Hamiltonian
function $H(q_1, \ldots, q_s, p_1, \ldots, p_s)$ is always an integral of the motion.
Thus for a conservative system the distribution law of the total energy
is time independent.

We must regard this result as the expression of the law of conservation of
energy in quantum mechanics.


Chapter III 
GENERAL PRINCIPLES OF QUANTUM STATISTICS 


§1. Basic concepts of statistical methods in physics 


In the following, we shall call physical theories phenomenological if they 
are developed independently of the atomistic model of the structure of 
matter. In phenomenological theories, the state of a system is specified by 
the values of a small number of variables which characterize that system. 
Thus, the state of a given mass of gas, not under the influence of any ex- 
ternal force field, is completely specified by the values of its volume and 
temperature, so that every other variable characterizing its state (e.g., the 
pressure of the gas) is a function of these two basic characteristics. From 
the point of view of a statistical theory, the state of a gas is not uniquely 
defined by giving its volume and temperature, because there are countless 
different combinations of positions and velocities of the particles of the gas 
consistent with given values of the volume and temperature. The state of 
a gas is uniquely defined in a classical statistical theory only when the po- 
sitions and velocities of all of its particles are given. Thus each distinct 
state in the sense of a phenomenological theory comprises a countless set 
of states which are distinct in the sense of a statistical theory. This fact 
should cause no essential difficulties if every physical quantity which is of 
interest in the phenomenological theory (and hence a function of the vol- 
ume and temperature of the gas) has, in the sense of the statistical theory, 
the same value in all those different states consistent with the given volume 
and temperature. However, this is not so; the statistically defined pressure 
of a gas, for example, can be very different for different combinations of 
positions and velocities of the particles of the gas, all of which are consistent 
with the given volume and temperature. 

The situation is precisely the same in the general case. From the point
of view of a phenomenological theory the state of a physical system is
completely described by the values of a small number of physical quantities
$\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \ldots, \mathfrak{A}^{(m)}$. Every other quantity $\mathfrak{B}$, which characterizes the
state of the system, is a function of the quantities $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \ldots, \mathfrak{A}^{(m)}$.
In a statistical theory, one usually has not one, but countless states of the
system consistent with given values of the quantities $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \ldots,
\mathfrak{A}^{(m)}$. In these states, as a rule, the quantity $\mathfrak{B}$ will assume different values.
(In quantum physics, besides this problem there is another specific
difficulty: In a statistically defined state, a given quantity does not, in general,
have a definite value, but has only a certain probability distribution. We
will not dwell on this difficulty at the moment.) Now, if the statistical
theory is to provide a foundation for the phenomenological theory, then
it must be capable of predicting the values which $\mathfrak{B}$ can assume for fixed
values of the quantities $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \ldots, \mathfrak{A}^{(m)}$ and of explaining why
experiments (in agreement with the phenomenological theory) invariably give
the same value for $\mathfrak{B}$.

We shall examine this situation in a little more detail. If the values of the
quantities $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \ldots, \mathfrak{A}^{(m)}$ are known, then from the point of view
of the phenomenological theory the system is in a completely defined state
$U$. (This state should not be confused with the quantum-mechanical state
of a system for which the symbol $U$ was also used.) From the point of view
of the statistical theory, $U$ represents not one state but a whole family of
states $U_1, U_2, \ldots$. (This family, in general, has the power of the
continuum, so that it is impossible to enumerate its members. We avoid the
general case here only for simplicity of writing; the loss of generality is of
no importance to us at this point.) The quantity $\mathfrak{B}$, which in a
phenomenological theory has a completely defined value in the state $U$, takes on,
in general, different values $B_i$ in the different states $U_i$ $(i = 1, 2, \ldots)$. How
can we reconcile these facts?

Suppose that we measure the quantity $\mathfrak{B}$ many times in succession, in
such a way that from the phenomenological point of view the system stays
in state $U$. (It is immaterial whether we experiment on one system at
different times or on a whole set of identical systems at the same time.) From
the statistical viewpoint, we shall be concerned in these experiments with
different states $U_1, U_2, \ldots$ of the system, and so the results $B_1, B_2, \ldots$
will not all be the same. To the question “What is the value of the quantity
$\mathfrak{B}$ in state $U$?” the statistical theory can only respond with an arithmetic
average of the results obtained. If we make $n$ measurements and if the
system is found to be in state $U_1$ exactly $n_1$ times, state $U_2$ exactly $n_2$ times,
and so on, then we answer that in state $U$

    $\mathfrak{B} = (n_1 B_1 + n_2 B_2 + \cdots)/n = \lambda_1 B_1 + \lambda_2 B_2 + \cdots,$

where the $\lambda_i$’s are the relative frequencies of the states $U_i$ in this set of
experiments.
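The identity between the arithmetic average of the results and the frequency-weighted sum can be checked on a small made-up record of experiments. The values $B_i$, the sequence of observed states, and the function name below are hypothetical and serve only to illustrate the formula.

```python
from collections import Counter

def mean_from_frequencies(observed_states, values):
    """Return (sum_i lambda_i * B_i, weights) with lambda_i = n_i / n."""
    n = len(observed_states)
    counts = Counter(observed_states)                       # n_i for each state
    weights = {state: counts[state] / n for state in counts}
    mean = sum(weights[state] * values[state] for state in weights)
    return mean, weights

# Hypothetical values B_i of the quantity in the states U_1, U_2, U_3.
values = {1: 2.0, 2: 2.5, 3: 4.0}

# Hypothetical record of n = 10 experiments: which state was found each time.
observed = [1, 1, 2, 3, 1, 2, 1, 3, 2, 1]

mean, weights = mean_from_frequencies(observed, values)

# Direct arithmetic average of the ten individual results.
arithmetic = sum(values[state] for state in observed) / len(observed)
print(mean, arithmetic, weights)
```

A different record of experiments would give different counts, and hence different weights, which is precisely the difficulty taken up in the text that follows.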

The numbers $B_1, B_2, \ldots$ are, of course, given by the theory. But the
numbers $\lambda_i$, the relative frequencies of the different states $U_i$ which are
members of the family $U$, cannot be determined by any theory. They
depend on the conditions of the experiment and change completely from one
set of experiments to another.

This state of affairs can be described further in the following way: From
the point of view of the statistical theory, $U$ is a family of states $U_i$, with
corresponding values $B_i$ of the quantity $\mathfrak{B}$. Hence, we can merely say that
$\mathfrak{B}$ has a certain mean value

    $\langle\mathfrak{B}\rangle = \sum_i \lambda_i B_i \qquad (\lambda_i \geq 0,\ \sum_i \lambda_i = 1)$

in the state $U$. The choice of the statistical “weights” $\lambda_i$ amounts to the
choice of a principle of averaging. In general, it is clear that different
methods of averaging will lead to different mean values of $\mathfrak{B}$. In order for the
value of $\mathfrak{B}$ predicted by the theory to agree with the arithmetic average
of the results of a given set of experiments, the statistical weight $\lambda_i$ should
agree with the relative number (“quota”) of those experiments in which
the system is found in state $U_i$ (and hence for which $\mathfrak{B}$ has the value $B_i$).

From what has been said, it is clear that this situation is really difficult.
On the one hand, the phenomenological theory definitely ascribes to the
quantity $\mathfrak{B}$ a unique value in state $U$ (i.e., in all the states $U_i$), which this
quantity must therefore assume in each experiment. On the other hand,
the statistical theory asserts that in different experiments the quantity $\mathfrak{B}$
will assume different values, and, as an analog to the phenomenological
value of $\mathfrak{B}$, it can at best offer a mean value $\langle\mathfrak{B}\rangle$ of the possible results
$B_i$ of the measurements.

The situation is actually worse than this. It turns out that for the calculation
of this mean value $\langle \mathfrak{B} \rangle$, the statistical theory cannot even offer
a definite method of averaging (set of statistical weights $\lambda_i$), since from its
point of view the method of averaging must depend on the conditions of
experimentation and can change from one set of experiments to another.

There is only one way to reconcile these contradictions which threaten
the scientific theory: to suppose that the quantity $\mathfrak{B}$ assumes identical (or
almost identical) values in all (or almost all) of the states $U_i$ which appear
as members of the family U. Actually, if almost all of the numbers $B_i$ are
very near to one another, then one can expect, for any experimental conditions,
that almost all of the results obtained will be nearly the same. Under
these conditions, the mean value $\langle \mathfrak{B} \rangle$ will be independent, within very
broad limits, of the principle of averaging we select; thus, it can be a suitable
analog to the value given by the phenomenological theory to the quantity
$\mathfrak{B}$ in the state U.

In constructing a statistical theory we follow just this path, as no other
is possible. First of all, we construct a method which yields a strictly determined
procedure for finding the mean value $\langle \mathfrak{B} \rangle$ of the quantity $\mathfrak{B}$
for given values of $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \cdots, \mathfrak{A}^{(k)}$. (These $\mathfrak{A}^{(i)}$'s usually include the
energy of the system and the so-called “external parameters”. Among the
latter the most important is the volume occupied by the system.) Then
we must demonstrate the suitability of this mean value $\langle \mathfrak{B} \rangle$. The demonstration
consists in showing that in the overwhelming majority of states
consistent with given values of the quantities $\mathfrak{A}^{(1)}, \mathfrak{A}^{(2)}, \cdots, \mathfrak{A}^{(k)}$, the quantity
$\mathfrak{B}$ assumes values very near to $\langle \mathfrak{B} \rangle$. After this is done, all difficulties
disappear. The mean value $\langle \mathfrak{B} \rangle$ is also that value which $\mathfrak{B}$ must assume,
according to the phenomenological theory, whenever the quantities $\mathfrak{A}^{(1)}$,
$\mathfrak{A}^{(2)}, \cdots, \mathfrak{A}^{(k)}$ take on this fixed set of values. The statistical theory thus
predicts that in an overwhelming majority of experiments the quantity
$\mathfrak{B}$ will have a value near $\langle \mathfrak{B} \rangle$.

However, this solution of the problem, apparently so simple, calls for a
great many reservations. First, it is necessary to keep in mind that in all
the foregoing the quantity $\mathfrak{B}$ cannot be any quantity which takes on a
definite value in each state which the statistical theory assigns to the system.
Thus, for example, if $\mathfrak{B}$ is the velocity of a definite particle belonging
to the system, then none of what was said above is applicable. Although
this quantity also has a definite mean value $\langle \mathfrak{B} \rangle$, one cannot argue for
the suitability of this average, because this would mean that for all, or
almost all, measurements, the velocity of the chosen particle would maintain
an approximately constant value. But in regard to such quantities as
the velocity of a discrete particle, everything described above is entirely
inapplicable, because, from the point of view of the phenomenological (not
statistical) theory, the very concept of a particle does not exist. It is immediately
clear that every quantity $\mathfrak{B}$ which can be defined in terms of
the phenomenological theory must, in the light of the statistical theory,
depend symmetrically on the states of all (identical) particles comprising
the system. Thus, it makes sense to question the suitability of the mean
value only for such quantities, and for them the question receives a favorable
answer (at least in the most important and most frequently met cases).

It is also necessary to bear in mind that the method which is selected in
the statistical theory for the construction of mean values of physical quantities
is a very natural one, but is nevertheless largely arbitrary. Of course,
the fact that the mean values obtained by this method are found, post
factum, to be suitable, can be used to justify the principle of averaging.
(Because of this suitability, clearly, any other method of averaging must
lead, as a rule, to the same mean value.) However, this question requires
more careful study. When we state that the value of $\mathfrak{B}$ is near $\langle \mathfrak{B} \rangle$
in almost all states of the system, we nevertheless suppose (as we are forced
to suppose by considering the known facts) that $\mathfrak{B}$ differs substantially
from $\langle \mathfrak{B} \rangle$ in a certain small exceptional set of states. What does the word
“small” mean here? How do we estimate the size of this set?

As we saw above, in choosing a principle of averaging we assign “weights”
$\lambda_i$ to the states $U_i$ which are members of the family U. A set of such states
is acknowledged to be “small” if the sum of the weights of all states belonging
to this set is small. Thus, one can say that every principle of averaging
ascribes a definite weight to any set of states $U_i$. Therefore, when we
said above that $\mathfrak{B} \approx \langle \mathfrak{B} \rangle$ ($\approx$ means “approximately equal to”) in almost
all states $U_i$, the precise meaning of this statement was the following:
$\mathfrak{B} \approx \langle \mathfrak{B} \rangle$ for all states $U_i$, with the exception of a certain set M of very
small weight. Let us now consider some new method of averaging in which
all sets of states receive new weights. If the exceptional set M, whose weight
for the old mode of averaging was small, receives a very small weight for
the new mode as well, then, as before, the states for which $\mathfrak{B} \approx \langle \mathfrak{B} \rangle$ receive
overwhelming weight, and the mean value given by the new method
will not be essentially different from $\langle \mathfrak{B} \rangle$.

Thus, from the suitability of the mean value $\langle \mathfrak{B} \rangle$, found from one
method of averaging, we can infer that any new method will give approximately
the same mean value, if sets of states whose weights were small in the
old method of averaging do not receive large weights in the new method. If this
is not the case, the new method can lead to a mean value significantly different
from the old. If this situation is idealized mathematically, we can
say that the choice of a principle of averaging is equivalent to the establishment
of a measure in the “space” of states $U_i$; and that, as the condition
for the practical equivalence of the mean values of a quantity for two different
principles of averaging, we can take the mutual absolute continuity
of the two corresponding measures. (A set of measure zero with respect to
the first measure must have measure zero with respect to the second measure
as well.)
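
The argument about exceptional sets can be tried out on a toy family of five states. The following is only a sketch under stated assumptions: the values $B_i$, the three sets of weights and the name `mean_value` are all hypothetical and chosen for illustration. The last state plays the role of the exceptional set M.

```python
def mean_value(values, weights):
    """Weighted mean sum(lambda_i * B_i); weights must be >= 0 and sum to 1."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * b for w, b in zip(weights, values))


# Hypothetical values B_i: the first four states nearly agree,
# the last state (the exceptional set M) gives a very different value.
values = [10.00, 10.01, 9.99, 10.02, 25.0]

# Old principle of averaging: M has weight 0.0004.
old_weights = [0.2499, 0.2499, 0.2499, 0.2499, 0.0004]
# New principle in which M still has small weight.
new_weights_small_m = [0.10, 0.40, 0.30, 0.1996, 0.0004]
# New principle in which M receives a large weight.
new_weights_large_m = [0.125, 0.125, 0.125, 0.125, 0.5]

old_mean = mean_value(values, old_weights)
mean_small_m = mean_value(values, new_weights_small_m)
mean_large_m = mean_value(values, new_weights_large_m)

print(old_mean, mean_small_m, mean_large_m)
```

The first two means practically coincide although the weights of the individual states differ considerably, while the third differs substantially, because the set M that was negligible in the old method is no longer negligible in the new one.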

In quantum statistics there is a case, very instructive and having re- 
markable consequences, in which a new principle of averaging replaced an 
old one because of certain fundamental considerations. The new principle 
led to a measure which was not absolutely continuous with respect to the 
previous one. The new mean values actually proved to be substantially 
different from the old ones and in many cases gave considerably better 
agreement with experiment. This situation occurred in the transition from 
the usual statistical concepts to the so-called “new statistics” — symmetric 
and antisymmetric. In the next section we shall have occasion to consider 
this example in complete detail. 

Finally, to follow our plan in the domain of quantum physics, it is very 
important for us to consider the following circumstance: Even for the most 
accurately defined states of the system, we do not as a rule know the exact 
values of physical quantities but only their distribution laws (and, in par- 
ticular, their mathematical expectations). This circumstance, as we shall 
see, does not prevent our establishing the suitability of mean values for 
the more important of these quantities, but merely complicates the neces- 
sary calculations. 


§2. Microcanonical averages 


Following the program we have planned, we shall now establish the 
principle of averaging which will be used throughout the book. 
We shall deal only with those states in which the total energy E of our
system is precisely fixed (“stationary” states). As we see from Chapter II,
the set of such states is described in quantum physics by the set $\mathfrak{M}_E$ of
eigenfunctions of the energy operator $\mathcal{H}$ of the system, corresponding to
the eigenvalue E of this operator. Let $U_1, U_2, \cdots, U_m$ be a complete
orthonormal system of functions of this family. Then any normalized function
U of this family can be represented (in a unique manner) in the form


(1)    $$U = \sum_{k=1}^{m} \alpha_k U_k,$$
where $\alpha_1, \alpha_2, \cdots, \alpha_m$ are complex numbers satisfying the relation
(2)    $$\sum_{k=1}^{m} |\alpha_k|^2 = 1;$$


and, conversely, every function of the form (1), where the numbers $\alpha_k$ are
subject to condition (2), is one of the (normalized) eigenfunctions of the
operator $\mathcal{H}$, corresponding to the eigenvalue E of this operator. In this way,
the family of stationary states of the system having a total energy E is
placed in one-to-one correspondence with the complex m-dimensional sphere
(2). To establish a measure in this space, we can therefore choose any
measure on the sphere (2). The most natural choice, of course, is the following:
We assume that


$$\alpha_k = r_k e^{i\varphi_k} \qquad (r_k \ge 0,\ \sum_{k=1}^{m} r_k^2 = 1,\ 0 \le \varphi_k < 2\pi,\ 1 \le k \le m),$$
and denote by S the real sphere


$$\sum_{k=1}^{m} r_k^2 = 1.$$


Then, in the space of the moduli $r_k$, we establish the natural Euclidean
metric of the real sphere S, and we assume that the phases $\varphi_k$ are uniformly
distributed on the interval $(0, 2\pi)$, independent of one another and of the
moduli. Thus, the volume element of the complex sphere (2) for this measure
has the form


$$dS\, d\varphi_1 \cdots d\varphi_m,$$


where dS is the “surface element” of the real sphere S. In other words, if
M is any measurable set of points on the complex sphere (2), and if
$\Psi(\alpha_1, \cdots, \alpha_m)$ is the characteristic function of the set M [i.e., $\Psi(\alpha_1, \cdots, \alpha_m)$
is equal to 1 or 0, depending upon whether or not the point $(\alpha_1, \cdots, \alpha_m)$
belongs to the set M], then the measure of the set M is proportional to the
integral


$$\int_S dS \int_0^{2\pi} \cdots \int_0^{2\pi} \Psi(\alpha_1, \alpha_2, \cdots, \alpha_m)\, d\varphi_1\, d\varphi_2 \cdots d\varphi_m.$$


We normalize the measure established in this manner, so that the measure
of the entire complex sphere (2) (and, hence, also the measure of every
family of stationary states which we consider) is unity. [It is easily seen
that the normalizing factor is $(2\pi)^{-m} \Sigma^{-1}$, where $\Sigma = 2\pi^{m/2}/\Gamma(\tfrac{1}{2}m)$ is the
Euclidean “area” of the sphere S.] Normalization is necessary in order that
we may consider the measure of a set as its statistical weight in further
averages.
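
The bracketed remark about the normalizing factor can be checked numerically for small m. The sketch below only evaluates the formulas as quoted in the text; the function names `sphere_area` and `normalizing_factor` are ours and purely illustrative.

```python
import math


def sphere_area(m):
    """Euclidean area of the unit sphere r_1^2 + ... + r_m^2 = 1 in m dimensions."""
    return 2.0 * math.pi ** (m / 2.0) / math.gamma(m / 2.0)


def normalizing_factor(m):
    """Factor (2*pi)^(-m) / area, as quoted in the text for the measure dS dphi_1 ... dphi_m."""
    return (2.0 * math.pi) ** (-m) / sphere_area(m)


# Familiar special cases: the circle (m = 2) and the ordinary sphere (m = 3).
circle_length = sphere_area(2)   # equals 2*pi
sphere_surface = sphere_area(3)  # equals 4*pi

# With this factor, the integral of the volume element over S and all phases is 1.
m = 5
total_measure = normalizing_factor(m) * sphere_area(m) * (2.0 * math.pi) ** m
```

For m = 2 and m = 3 the formula reproduces the circumference of the unit circle and the surface of the unit sphere, which is a quick sanity check on the reading of the Gamma-function expression.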

Now let f(U) be a quantity which assumes a definite value in each of the
stationary states U of our family. We can consider


$$f(U) = f(\alpha_1 U_1 + \cdots + \alpha_m U_m)$$


as a function of the variables $\alpha_1, \alpha_2, \cdots, \alpha_m$. According to the measure
we have adopted, the mean value $\langle f(U) \rangle$ of the quantity f(U) is given
by the integral


$$\langle f(U) \rangle = \mu \int_S dS \int_0^{2\pi} \cdots \int_0^{2\pi} f(U)\, d\varphi_1 \cdots d\varphi_m,$$


which can, of course, be written more briefly in the form


$$\langle f \rangle = \int_{S^*} f\Bigl(\sum_{k=1}^{m} \alpha_k U_k\Bigr)\, dS^*,$$


where S* is the complex sphere (2), and dS* is the “surface” element of
this sphere. ($dS^* = \mu\, dS\, d\varphi_1 \cdots d\varphi_m$, where $\mu$ is the aforementioned
normalizing factor.) In the following, we shall call a quantity of the form f(U)
a phase function of our system, and the mean value $\langle f(U) \rangle$ which we have
defined will be called its microcanonical average. We have established in
this way a particular uniform principle for the construction of mean values
of phase functions: the principle of “microcanonical” averaging.

We know, however, that the physical quantities which characterize the
state of a system in quantum physics, in general, are not phase functions.
Such a quantity $\mathfrak{A}$ can assume different values in a given state U, and from
quantum theory we can obtain the distribution law of these values. As we
know from Chapter II, to the quantity $\mathfrak{A}$ there corresponds a certain (linear
self-adjoint) operator $\mathcal{A}$ which acts on the function U. The mathematical
expectation $\mathsf{E}_U \mathfrak{A}$ of $\mathfrak{A}$ in the state U is equal to the scalar product
$(\mathcal{A}U, U)$. This rule, if we apply it to all quantities $\mathfrak{A}$ and all states U, gives
us, as we saw in Chapter II, the distribution law of any physical quantity
as well as its mathematical expectation.

Therefore, in general, we cannot consider the physical quantity $\mathfrak{A}$ to be a
phase function, and hence cannot speak of microcanonical averages of such
quantities. But the mathematical expectation $(\mathcal{A}U, U)$ of $\mathfrak{A}$ in the state U
is always a phase function, so that we can speak of the microcanonical
average $\langle (\mathcal{A}U, U) \rangle$ of this mathematical expectation:
$$\langle (\mathcal{A}U, U) \rangle = \langle \mathsf{E}_U \mathfrak{A} \rangle = \int_{S^*} (\mathcal{A}U, U)\, dS^*.$$


From now on we shall call this microcanonical average of the mathematical
expectation of the quantity $\mathfrak{A}$ the microcanonical average of the quantity
$\mathfrak{A}$ itself. (Such an extension of the concept of microcanonical averaging can
hardly lead to an inconsistency, because, in the case of quantities which do
not have a uniquely defined value in every state, the concept of microcanonical
averaging has not been defined.) In this manner we obtain


(3)    $$\langle \mathfrak{A} \rangle = \langle \mathsf{E}_U \mathfrak{A} \rangle = \int_{S^*} (\mathcal{A}U, U)\, dS^*,$$


where $\mathcal{A}$ is the operator corresponding to $\mathfrak{A}$. One can say that this relation
defines the microcanonical average for phase functions whose values are
not ordinary numbers but random variables.

The construction of the mathematical expectation of a random variable
always consists of a certain averaging process. Therefore, in quantum statistics
we shall at every step have to consider two averaging processes which
are generally unrelated: the formation of the mathematical expectation of a
random variable and the microcanonical average. It is extremely important
never to confuse these two processes. In the sequel, the pair of angular
brackets $\langle\ \rangle$ will always denote a microcanonical average. The term
“mean value” will also always denote a microcanonical average and must
not be confused with the mathematical expectation of a random variable.
Wherever necessary, we shall denote the mathematical expectation of $\mathfrak{A}$
in the state U by the symbol $\mathsf{E}_U \mathfrak{A}$.

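Now we reduce expression (3) for the microcanonical average of $\mathfrak{A}$ to a
form which is considerably more convenient for our later purposes. Since


$$U = \sum_{k=1}^{m} \alpha_k U_k,$$
then by the linearity of the operator $\mathcal{A}$,
$$(\mathcal{A}U, U) = \Bigl(\sum_{k=1}^{m} \alpha_k \mathcal{A}U_k,\ \sum_{l=1}^{m} \alpha_l U_l\Bigr).$$
Whence, by the known properties of the scalar product (see II, §1)
$$(\mathcal{A}U, U) = \sum_{k,l=1}^{m} \alpha_k \alpha_l^* (\mathcal{A}U_k, U_l).$$
Therefore, (3) gives


$$\langle \mathfrak{A} \rangle = \sum_{k,l=1}^{m} (\mathcal{A}U_k, U_l) \int_{S^*} \alpha_k \alpha_l^*\, dS^*.$$


For $k \ne l$,
$$\int_{S^*} \alpha_k \alpha_l^*\, dS^* = \mu (2\pi)^{m-2} \int_S r_k r_l\, dS \int_0^{2\pi}\!\!\int_0^{2\pi} e^{i(\varphi_k - \varphi_l)}\, d\varphi_k\, d\varphi_l = 0;$$
and for $k = l$, since by symmetry the integral has the same value for every k,
$$\int_{S^*} |\alpha_k|^2\, dS^* = (1/m) \int_{S^*} \Bigl\{\sum_{k=1}^{m} |\alpha_k|^2\Bigr\}\, dS^* = (1/m) \int_{S^*} dS^* = 1/m,$$


since, according to our normalization, the measure of the complex sphere S*
is unity. Thus, we obtain


(4)    $$\langle \mathfrak{A} \rangle = (1/m) \sum_{k=1}^{m} (\mathcal{A}U_k, U_k).$$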


This means that the microcanonical average of any physical quantity is 
simply equal to the arithmetic average of the mathematical expectations of 
this quantity over all the states of some orthonormal system. Thus, micro- 
canonical averaging is equivalent to averaging over an orthonormal basis 
in which identical weights are ascribed to each term. 
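
Formula (4) can be compared with a direct numerical average over the sphere (2). The following is a minimal Monte Carlo sketch, not part of the original argument: the 3-by-3 matrix of elements $(\mathcal{A}U_k, U_l)$ is hypothetical, the function names are ours, and the moduli are drawn by normalizing Gaussian variables and taking absolute values, which gives the Euclidean surface measure on the part of the real sphere with non-negative coordinates.

```python
import cmath
import math
import random


def sample_coefficients(m, rng):
    """Draw (alpha_1, ..., alpha_m): moduli uniform on the real unit sphere
    (non-negative part), phases independent and uniform on [0, 2*pi)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(m)]
    norm = math.sqrt(sum(x * x for x in g))
    moduli = [abs(x) / norm for x in g]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]
    return [r * cmath.exp(1j * p) for r, p in zip(moduli, phases)]


def expectation(a, alpha):
    """(A U, U) = sum over k, l of alpha_k * conj(alpha_l) * a[k][l],
    where a[k][l] stands for the matrix element (A U_k, U_l)."""
    m = len(alpha)
    total = sum(alpha[k] * alpha[l].conjugate() * a[k][l]
                for k in range(m) for l in range(m))
    return total.real  # real because the matrix is Hermitian


def microcanonical_average(a, samples, rng):
    """Monte Carlo estimate of the integral of (A U, U) over the sphere (2)."""
    m = len(a)
    return sum(expectation(a, sample_coefficients(m, rng))
               for _ in range(samples)) / samples


# Hypothetical Hermitian matrix of elements (A U_k, U_l), m = 3.
A = [[1.0, 0.5 + 0.3j, 0.0],
     [0.5 - 0.3j, 2.0, -0.4j],
     [0.0, 0.4j, 6.0]]

rng = random.Random(0)
estimate = microcanonical_average(A, 20000, rng)
arithmetic_average = sum(A[k][k] for k in range(3)) / 3  # right side of (4)
print(estimate, arithmetic_average)
```

The estimate fluctuates with the random seed, but it should lie close to the arithmetic average of the diagonal elements; the off-diagonal elements drop out because the phases are independent and uniform.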

We have based our expression for the microcanonical average on a definite
linear orthonormal basis $U_1, U_2, \cdots, U_m$ of the manifold $\mathfrak{M}_E$. It is obvious,
however, that to be a meaningful expression, it must be independent
of the chosen basis. We now demonstrate this independence. (As a matter
of fact, this independence follows from the invariance, with respect to the
choice of orthonormal basis, of the measure introduced on p. 76.) If
$V_1, V_2, \cdots, V_m$ is another orthonormal linear basis of the manifold $\mathfrak{M}_E$, then (II, §3)


$$V_k = \sum_i \lambda_{ki} U_i \qquad (1 \le k \le m),$$


where the numbers $\lambda_{ki}$ constitute a unitary matrix, and the summation
here and in the following extends from 1 to m. Thus,


$$(\mathcal{A}V_k, V_k) = \Bigl(\sum_i \lambda_{ki} \mathcal{A}U_i,\ \sum_j \lambda_{kj} U_j\Bigr) = \sum_{i,j} \lambda_{ki} \lambda_{kj}^* (\mathcal{A}U_i, U_j);$$


and hence,
$$\sum_k (\mathcal{A}V_k, V_k) = \sum_{i,j} \Bigl(\sum_k \lambda_{ki} \lambda_{kj}^*\Bigr) (\mathcal{A}U_i, U_j).$$
But, by the unitary property of the matrix $\lambda_{ki}$, we have
$$\sum_k \lambda_{ki} \lambda_{kj}^* = \delta_{ij} \qquad (1 \le i, j \le m).$$
Thus,
$$\sum_k (\mathcal{A}V_k, V_k) = \sum_{i,j} \delta_{ij} (\mathcal{A}U_i, U_j) = \sum_i (\mathcal{A}U_i, U_i).$$


This equality shows that (4) is independent of the chosen basis. 
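
The same computation can be carried out with explicit numbers. In the sketch below the 2-by-2 Hermitian matrix of elements $(\mathcal{A}U_i, U_j)$ and the unitary matrix of the change of basis are hypothetical examples, and the function name is ours.

```python
import math


def diagonal_sum_in_new_basis(a, lam):
    """Sum over k of (A V_k, V_k), where V_k = sum_i lam[k][i] * U_i and
    a[i][j] stands for the matrix element (A U_i, U_j)."""
    m = len(a)
    return sum(lam[k][i] * lam[k][j].conjugate() * a[i][j]
               for k in range(m) for i in range(m) for j in range(m))


# Hypothetical Hermitian matrix of elements (A U_i, U_j).
a = [[1.0 + 0j, 2.0 + 1.0j],
     [2.0 - 1.0j, 5.0 + 0j]]

# A unitary matrix: its rows (and columns) are orthonormal.
c = 1.0 / math.sqrt(2.0)
lam = [[c + 0j, c * 1j],
       [c * 1j, c + 0j]]

old_sum = a[0][0] + a[1][1]
new_sum = diagonal_sum_in_new_basis(a, lam)
print(old_sum, new_sum)
```

Up to rounding, the two sums agree, so dividing by m gives the same value of (4) in either basis.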


§3. Complete, symmetric and antisymmetric statistics


The principle of microcanonical averaging, which we defined in §2, is
obviously based on the idea of equal participation in the formation of mean
values of all the eigenfunctions of the operator $\mathcal{H}$ corresponding to a given
eigenvalue E. This was also the main idea in classical statistical mechanics.
There too the microcanonical average was extended in a uniform manner
over all points of phase space belonging to a given “surface of constant
energy”. In classical mechanics this idea was based on the assumption that
the equations of motion had no other single-valued integrals besides the
energy integral (see Introduction, §2). Thus, if there exists an integral J
of the equations of motion independent of the energy integral, and if in a
given system J = C, then only a subset of the set of points constituting the
“energy surface” can actually represent states of the system. This subset
constitutes a manifold of a smaller number of dimensions and is characterized
by the energy of the system and the equation J = C. In this case
the averaging in which all points of the “energy surface” share equally will,
as a rule, lead to incorrect results, because in this averaging states which are
inaccessible to the system would play an overwhelming role. Moreover,
the fact that the methods of classical statistical mechanics, where applicable
(i.e., where the transition to quantum physics is not required), always lead
to results which agree well with experiment, doubtless shows that in these
problems such “inhibiting” integrals are actually absent.

In quantum physics a similar question arises. If we choose a method of 
averaging in which all stationary states of the system, corresponding to the 
same value of its energy, receive an equal weight, then we must assume that 
nothing prevents our system from actually finding itself in any of these 
states, i.e., that all of these states are accessible to the system. 

However, for the most important and frequently encountered systems of
quantum physics, this assumption is incorrect. Such systems, as a rule,
cannot attain all the states which are described by the eigenfunctions U of
the operator $\mathcal{H}$, corresponding to its eigenvalue E. Only a very insignificant
fraction of these states can be attained. A valid averaging method for such
a system must therefore be significantly different from the microcanonical
one, as we have defined it above. In a great many physically important
cases this new averaging method leads to results which are substantially
different from the microcanonical averages. We must now consider in detail
how this comes about.

Let our system consist of n identical and completely indistinguishable
particles. We denote by $(q_i, p_i)$ the set of all Hamiltonian variables of the
ith particle. The Hamiltonian function of the system, due to the indistinguishability
of the particles, must be a symmetric function of the variables
$(q_1, p_1), (q_2, p_2), \cdots, (q_n, p_n)$. (We also allow the possibility of
interaction between the particles, and even allow the Hamiltonian function to
depend explicitly on the time.) Clearly, the operator $\mathcal{H}$ of the system will
exhibit the same symmetry with regard to any pair of particles. Let the
function


$$U = U(q_1, q_2, \cdots, q_n, t)$$
satisfy the Schrödinger equation
(5)    $$i\hbar\, \partial U/\partial t = \mathcal{H}U.$$
We show that in this case the function

$$\hat{U} = U(q_2, q_1, \cdots, q_n, t),$$


obtained from the function U by the permutation of the first two particles,
is also a solution of equation (5).
We denote by $\mathcal{P}$ the operator which transforms any function


$$f(q_1, q_2, \cdots, q_n, t)$$


into $f(q_2, q_1, \cdots, q_n, t)$ (the permutation operator of the first two particles),
and we analyze in more detail the meaning of the postulated symmetry
of the operator $\mathcal{H}$. To this end, we shall have to define the meaning
of the permutation of the first two particles in some operator $\mathcal{A}$. Let this
permutation transform $\mathcal{A}$ into an operator $\hat{\mathcal{A}}$. (We cannot denote this new
operator by $\mathcal{P}\mathcal{A}$, as would seem natural, because this symbol has another
meaning for us.) Suppose now that we wish to permute the first two particles
in the expression $\mathcal{A}U$. It is natural to think that for this purpose we
must permute them both in the operator $\mathcal{A}$ (the result will be $\hat{\mathcal{A}}$) and in
the function U (the result will be $\mathcal{P}U$). Therefore


$$\hat{\mathcal{A}}\mathcal{P}U = \mathcal{P}\mathcal{A}U.$$
Setting $\mathcal{P}U = V$, whence $U = \mathcal{P}V$ (since applying $\mathcal{P}$ twice restores the
original function), we find
$$\hat{\mathcal{A}}V = \mathcal{P}\mathcal{A}\mathcal{P}V;$$
or, in terms of operators alone,
$$\hat{\mathcal{A}} = \mathcal{P}\mathcal{A}\mathcal{P}.$$


This gives us a natural definition of the operator $\hat{\mathcal{A}}$. The symmetry of the
operator $\mathcal{A}$ is expressed by the requirement $\hat{\mathcal{A}} = \mathcal{A}$, or


$$\mathcal{P}\mathcal{A}\mathcal{P} = \mathcal{A},$$
whence


$$\mathcal{P}\mathcal{P}\mathcal{A}\mathcal{P} = \mathcal{A}\mathcal{P} = \mathcal{P}\mathcal{A},$$
i.e., a symmetric operator must commute with the operator $\mathcal{P}$ of the permutation
of two particles (and hence also with the operator of any permutation
of the particles among themselves).

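In particular, the symmetry of the operator $\mathcal{H}$ implies


$$\mathcal{P}\mathcal{H} = \mathcal{H}\mathcal{P},$$


i.e., the operators $\mathcal{P}$ and $\mathcal{H}$ commute with one another. Thus, from (5) it
follows that


$$i\hbar\, \partial \hat{U}/\partial t = \mathcal{P}(i\hbar\, \partial U/\partial t) = \mathcal{P}\mathcal{H}U = \mathcal{H}\mathcal{P}U = \mathcal{H}\hat{U},$$


which proves our assertion.
Therefore, the function $\hat{U} = \mathcal{P}U$ will be a solution of the Schrödinger
equation, and so also will the function


$$\psi_t = \psi(q_1, \cdots, q_n, t) = U - \hat{U}.$$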
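
The relation between symmetry and commutation with the permutation operator can be verified on a small matrix model. This is only an illustrative sketch under assumptions of our own: two particles, each with two basis states, operators represented by 4-by-4 integer matrices in the basis of products labelled (a, b), and hypothetical matrices `h` and `v`.

```python
from itertools import product

D = 2  # number of single-particle basis states in this toy model
PAIRS = list(product(range(D), repeat=2))  # labels (a, b) of two-particle basis functions


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def permutation_matrix():
    """Matrix of the permutation operator: label (a, b) is sent to (b, a)."""
    return [[1 if (c, d) == (b, a) else 0 for (c, d) in PAIRS]
            for (a, b) in PAIRS]


def two_particle_operator(h1, h2, v):
    """h1 acts on particle 1, h2 on particle 2; v[a][b] is a diagonal interaction."""
    return [[h1[a][c] * (b == d) + (a == c) * h2[b][d]
             + v[a][b] * ((a, b) == (c, d))
             for (c, d) in PAIRS] for (a, b) in PAIRS]


h = [[0, 1], [1, 2]]     # hypothetical single-particle operator
zero = [[0, 0], [0, 0]]
v = [[3, 1], [1, 4]]     # symmetric interaction: v[a][b] == v[b][a]

P = permutation_matrix()
H_sym = two_particle_operator(h, h, v)      # treats both particles alike
H_asym = two_particle_operator(h, zero, v)  # kinetic part acts on particle 1 only

sym_is_invariant = matmul(matmul(P, H_sym), P) == H_sym
sym_commutes = matmul(P, H_sym) == matmul(H_sym, P)
asym_commutes = matmul(P, H_asym) == matmul(H_asym, P)
print(sym_is_invariant, sym_commutes, asym_commutes)
```

The operator that treats both particles alike is unchanged by the permutation and commutes with it, whereas the operator that singles out the first particle does neither.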


Let us suppose now that at t = 0 the function U is symmetric with respect
to $q_1$ and $q_2$, i.e., that


$$U(q_1, q_2, \cdots, q_n, 0) = U(q_2, q_1, \cdots, q_n, 0)$$
identically in the $q_1, q_2, \cdots, q_n$. In other words, we have $\psi_0 = 0$ identically
in the $q_1, q_2, \cdots, q_n$. It follows that the norm of the function $\psi_0$
is also zero. Since, for functions satisfying the Schrödinger equation, the
norm cannot vary with time (see II, §4), the function $\psi_t$ has a vanishing
norm for all t, from which it follows that $\psi_t = 0$ for all $t, q_1, q_2, \cdots, q_n$.
This means that


$$U(q_1, q_2, \cdots, q_n, t) = U(q_2, q_1, \cdots, q_n, t)$$


for all values of these same variables.

Thus, every function which satisfies the Schrödinger equation and is
symmetric with respect to a certain pair of particles at a certain (arbitrary)
value of t, preserves this symmetry for all other (preceding or following)
values of t. If, therefore, the state of a system at t = 0 is described by an
eigenfunction which is symmetric with respect to a pair of particles, then
no state which lacks this symmetry is accessible to the system, either in the
past or in the future.

From the above it follows that if an eigenfunction


$$U = U(q_1, q_2, \cdots, q_n, t),$$


describing a state of the system, is symmetric with respect to any pair of particles
at t = 0 (i.e., if it is a symmetric function of the variables $q_1, q_2, \cdots, q_n$),
then it preserves this property at all other (preceding or following)
times. For such a system only those states are accessible which are described
by eigenfunctions symmetric with respect to all the variables $q_i$. Obviously,
if the mean values of physical quantities are to have real meaning for systems
of this kind, the averaging must be carried out only over symmetric
eigenfunctions, because all other eigenfunctions describe states which are
fundamentally inaccessible to the system. Therefore, taking these other
functions into account in the formation of mean values could lead to results
which are incompatible with physical reality.

As we know, the eigenfunctions corresponding to a given eigenvalue E
of the operator $\mathcal{H}$ generate a linear manifold $\mathfrak{M}_E$, whose dimension we shall
denote by m. The symmetric functions that belong to the manifold $\mathfrak{M}_E$
also generate a certain linear manifold $\mathfrak{S}_E$, whose dimension s is, in general,
less than m. If we know that our system can be found only in states which
are described by symmetric eigenfunctions, then for such a system, obviously,
the averaging must extend over the manifold $\mathfrak{S}_E$, and not over
the manifold $\mathfrak{M}_E$. In all other respects, the considerations which led us to
the choice of a natural principle of averaging over the manifold $\mathfrak{M}_E$ remain
valid in the new case. We choose an orthonormal linear basis $S_1,
S_2, \cdots, S_s$ of the manifold $\mathfrak{S}_E$ and, in analogy with formula (4) of §2,
we define the microcanonical average of the quantity $\mathfrak{A}$ by means of the
relation


(6)    $$\langle \mathfrak{A} \rangle = (1/s) \sum_{k=1}^{s} (\mathcal{A}S_k, S_k),$$


where $\mathcal{A}$ is the operator assigned to $\mathfrak{A}$.

If all of the possible stationary states of a system are described by sym- 
metric eigenfunctions, then we shall say that this system is subject to sym- 
metric statistics. For such a system, the microcanonical averages of all physi- 
cal quantities must be computed from (6). Along with systems of this kind, 
we shall also consider systems whose statistics bear an antisymmetric 
character. A function 


$$U = U(q_1, q_2, \cdots, q_n)$$
is called antisymmetric with respect to the variables $q_1$ and $q_2$ if
$$\mathcal{P}U = U(q_2, q_1, \cdots, q_n) = -U$$


identically, i.e., if interchanging the variables $q_1$ and $q_2$ causes only a change
in the sign of the function. If $q_i$ is understood to be the set of generalized
coordinates which describe the position of the ith particle, then it is easy
to see that antisymmetry with respect to a given pair of particles is, like
symmetry, invariant with respect to the Schrödinger equation. From this
it follows that if the state of a system is described at any time t by an eigenfunction
antisymmetric with respect to any pair of particles, then the same
description will obtain at all preceding and following times. For such a
system only states described by antisymmetric eigenfunctions are accessible.
In complete analogy with the above, we must conclude that for systems
of this type antisymmetric statistics hold: All averages must be carried
out over the linear manifold $\mathfrak{A}_E$ of antisymmetric eigenfunctions (comprising,
like $\mathfrak{S}_E$, only a part of the manifold $\mathfrak{M}_E$).

All of the above still does not answer the question of whether we must 
consider the statistics, to which the system is subject, as a special property 
of the component particles or as a mere accident for the chosen system. For 
example, let the system obey symmetric statistics. Then will every other 
system, composed of particles of the same nature, under any conditions 
necessarily obey symmetric statistics, too? 

Both experiments and many theoretical considerations, which we cannot 
discuss here, tell us that this is precisely the case. The kind of statistics 
obeyed by a certain system does not depend on the conditions in which it is 
found, nor on any random effects, but is entirely determined by the nature 
of the particles which constitute the system. The elementary material par- 
ticles (electrons, protons, neutrons) always obey antisymmetric statistics. 
Photons, on the other hand, are always governed by symmetric statistics. 
The type of statistics obeyed by material particles of a more complex structure
is determined by the number of elementary particles which comprise a
single particle of the complex type. The statistics will be symmetric or antisymmetric
depending on the evenness or oddness of this number. Finally,
both experimental and theoretical considerations lead us to say that for
systems which are composed of free “non-localized” particles (i.e., those
not bound to a definite place) no other type of statistics, besides symmetric
and antisymmetric, is encountered. If, on the contrary, the particles are
localized in space, then they lose their indistinguishability. (The location
of two particles at different places makes them distinct and enables us to
distinguish between them.) In this case even the statistics which we originally
defined can be valid, i.e., those in which averages are extended over
the entire manifold $\mathfrak{M}_E$. We shall therefore call them complete statistics.

Thus, in quantum statistical physics we must invariably consider three 
distinct statistical schemes — complete, symmetric and antisymmetric. For 
each case we must select the scheme which is characteristic of the type of 
particle being studied. However, the general development of the theory for 
all three schemes is carried out along parallel lines. 

We ascribe to each physical system, which is composed of particles of a
single type, an “index of symmetry” $\sigma$. This index $\sigma$ equals 0, 1, or $-1$, depending
on whether the particles of the system are subject to complete,
symmetric or antisymmetric statistics, respectively. Because the type of
statistics, as we know, remains constant in time, we can consider $\sigma$ to be a
peculiar “integral” of the Schrödinger equation. Just as in classical mechanics
each new single-valued integral of the equations of motion reduces
the manifold over which averages must be extended, so here too the presence
of the “integral” $\sigma$ compels us (if $\sigma \ne 0$) to substitute for the manifold
$\mathfrak{M}_E$ the reduced manifold $\mathfrak{S}_E$ or $\mathfrak{A}_E$ for the purpose of averaging.
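
How strongly the index of symmetry reduces the manifold of averaging can be seen in the simplest case. The sketch below is an illustration under assumptions that are not in the text: only two particles, each with d single-particle states, and all $d^2$ product functions taken to belong to the same energy level, so that they span the complete manifold. The function names are ours.

```python
import math
from itertools import combinations, combinations_with_replacement, product


def averaging_basis(sigma, d):
    """Orthonormal basis of the manifold used for averaging, for two particles
    with d single-particle states each. Every basis function is a dict mapping
    a product label (a, b) to its coefficient.
    sigma = 0: complete, sigma = 1: symmetric, sigma = -1: antisymmetric."""
    c = 1.0 / math.sqrt(2.0)
    if sigma == 0:
        return [{(a, b): 1.0} for a, b in product(range(d), repeat=2)]
    if sigma == 1:
        basis = []
        for a, b in combinations_with_replacement(range(d), 2):
            if a == b:
                basis.append({(a, a): 1.0})
            else:
                basis.append({(a, b): c, (b, a): c})
        return basis
    if sigma == -1:
        return [{(a, b): c, (b, a): -c} for a, b in combinations(range(d), 2)]
    raise ValueError("sigma must be 0, 1 or -1")


def permute(function):
    """Permute the two particles: the coefficient of (a, b) moves to (b, a)."""
    return {(b, a): coeff for (a, b), coeff in function.items()}


d = 3
dimensions = {sigma: len(averaging_basis(sigma, d)) for sigma in (0, 1, -1)}
print(dimensions)
```

For d = 3 the complete manifold has dimension 9, the symmetric one 6 and the antisymmetric one 3. That the two reduced dimensions add up to the complete one is a peculiarity of two particles and should not be expected for larger systems.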


§4. Construction of the fundamental linear basis


Suppose we have a system, composed of N identical particles of arbitrary
structure, occupying a finite volume V. Let
(7)    $$u_1(x), u_2(x), \cdots, u_r(x), \cdots$$


be a complete orthonormal system (linear basis) of eigenfunctions of the
energy operator of an individual particle, where, for brevity, we denote by x
the set of coordinates which define the position of the particle. We number
them so that the eigenvalues (energy levels) will form a non-decreasing
sequence: If the eigenfunction $u_r(x)$ corresponds to the eigenvalue $\varepsilon_r$, then


$$\varepsilon_1 \le \varepsilon_2 \le \cdots \le \varepsilon_r \le \cdots.$$
In the sequence $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_r, \cdots$ each energy level is repeated a number
of times equal to its multiplicity (degree of degeneracy), i.e., a number of
times equal to the number of functions in the sequence (7) which correspond
to this level.
Now we consider the function


(8)    $$U = u_{r_1}(q_1)\, u_{r_2}(q_2) \cdots u_{r_N}(q_N),$$


where $q_i$ denotes the set of coordinates of the position of the ith particle,
and $r_1, r_2, \cdots, r_N$ are arbitrary integers. Hence, U is a function of the
positional coordinates of all N of the particles which compose the system
(or, equivalently, of the positional coordinates of the system itself).

THEOREM 1. The function U, defined by (8), is an eigenfunction of the
energy operator $\mathcal{H}$ of the system, corresponding to the eigenvalue


(9)    $$E = \varepsilon_{r_1} + \varepsilon_{r_2} + \cdots + \varepsilon_{r_N}$$


of this operator.
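Proof. Denoting by $\mathcal{H}_i$ the energy operator of the ith particle, we have


$$\mathcal{H} = \sum_{i=1}^{N} \mathcal{H}_i,$$
and hence
$$\mathcal{H}U = \sum_{i=1}^{N} \mathcal{H}_i U.$$
But, since the operator $\mathcal{H}_i$ operates only on the function $u_{r_i}(q_i)$,


$$\mathcal{H}_i U = u_{r_1}(q_1) \cdots u_{r_{i-1}}(q_{i-1})\, [\mathcal{H}_i u_{r_i}(q_i)]\, u_{r_{i+1}}(q_{i+1}) \cdots u_{r_N}(q_N);$$
and since $u_{r_i}(q_i)$ is an eigenfunction of the operator $\mathcal{H}_i$, belonging to the
eigenvalue $\varepsilon_{r_i}$, we have


$$\mathcal{H}_i u_{r_i}(q_i) = \varepsilon_{r_i} u_{r_i}(q_i).$$


Hence,
$$\mathcal{H}_i U = \varepsilon_{r_i} U,$$
and
$$\mathcal{H}U = \sum_{i=1}^{N} \mathcal{H}_i U = \Bigl(\sum_{i=1}^{N} \varepsilon_{r_i}\Bigr) U = EU,$$
q.e.d.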


If the energy level E of the system is assigned beforehand, then any
choice of integers $r_i$ ($i = 1, 2, \cdots, N$), which satisfy relation (9), gives
us an eigenfunction U of the operator $\mathcal{H}$, corresponding to this level.
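
For a short list of single-particle levels, the choices of integers $r_i$ that satisfy relation (9) can simply be enumerated. The sketch below assumes a hypothetical spectrum cut off after four levels (the actual sequence of levels is infinite) and a hypothetical helper name; indices are numbered from 1, as in the text.

```python
from itertools import product


def basis_index_tuples(levels, n_particles, energy, tol=1e-9):
    """All tuples (r_1, ..., r_N), with 1-based indices into `levels`,
    whose single-particle energies add up to `energy` (relation (9))."""
    indices = range(1, len(levels) + 1)
    return [r for r in product(indices, repeat=n_particles)
            if abs(sum(levels[i - 1] for i in r) - energy) <= tol]


# Hypothetical truncated spectrum; the level 1.0 is doubly degenerate,
# so it appears twice, as prescribed for the sequence of levels.
levels = [0.0, 1.0, 1.0, 2.0]

tuples = basis_index_tuples(levels, n_particles=2, energy=2.0)
print(tuples)
# [(1, 4), (2, 2), (2, 3), (3, 2), (3, 3), (4, 1)]
```

Each tuple labels one product function of the form (8). Tuples that differ only in the order of their entries, such as (1, 4) and (4, 1), are counted separately here, which corresponds to complete statistics.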

THEOREM 2. The set of all functions of the form (8), obtained for all values
of the integers $r_1, r_2, \cdots, r_N$ which satisfy relation (9), is a linear orthonormal
basis of the manifold $\mathfrak{M}_E$ of eigenfunctions of the energy operator $\mathcal{H}$ of
the given system, corresponding to its eigenvalue E.

Proof. We show first that the set of all functions of the form (8), ob- 
tained for all possible values of the integers rı , 7, +++, rw , satisfying re- 
lation (9) or not, represents a complete orthonormal system of functions. 
The orthogonality and normality of this system follow in an obvious way 
from the corresponding properties of system (7), and the proof requires 
showing only the completeness of the system (8). This proof is most easily 
carried out with the help of induction on the number of particles N. As a 
matter of fact, for N = 1, the completeness of system (8) coincides with 
the assumed completeness of system (7). Therefore, take N > 1, and let 
our assertion be true for systems composed of N — 1 particles. Let 


    $\varphi(q_1, q_2, \ldots, q_N)$


be orthogonal to all functions of the form (8), so that for any values of
the integers $r_1, r_2, \ldots, r_N$,


    $I = \int \varphi(q_1, \ldots, q_N)\, u_{r_1}(q_1) \cdots u_{r_N}(q_N)\, dq_1 \cdots dq_N = 0.$

Setting

(10)    $\int \varphi(q_1, \ldots, q_N)\, u_{r_1}(q_1) \cdots u_{r_{N-1}}(q_{N-1})\, dq_1 \cdots dq_{N-1} = \psi_{r_1 \ldots r_{N-1}}(q_N),$


we have

    $I = \int \psi_{r_1 \ldots r_{N-1}}(q_N)\, u_{r_N}(q_N)\, dq_N = 0.$


Because this equality holds for any $r_N$, by the completeness of system (7)

    $\psi_{r_1 \ldots r_{N-1}}(q_N) = 0$


for any $r_1, r_2, \ldots, r_{N-1}$. By (10) this gives for any $r_1, r_2, \ldots, r_{N-1}$ and
any fixed $q_N$,


    $\int \varphi(q_1, \ldots, q_N)\, u_{r_1}(q_1) \cdots u_{r_{N-1}}(q_{N-1})\, dq_1 \cdots dq_{N-1} = 0.$


It follows by mathematical induction that

    $\varphi(q_1, \ldots, q_{N-1}, q_N) = 0$


for any $q_N$, and identically in the $q_1, \ldots, q_{N-1}$. In other words, this
equation is satisfied identically in all $N$ variables, which proves our assertion.
It is now easy to show that functions of the form (8), in which the numbers
$r_1, \ldots, r_N$ are related by (9), constitute a linear basis of the manifold
$\mathfrak{M}_E$. In fact, let $V = V(q_1, \ldots, q_N)$ be any eigenfunction of the
operator $\mathcal{H}$, corresponding to the eigenvalue $E$. By the established
completeness of the system of functions of the form (8) we can write


    $V = \sum_i c_i U_i,$


where the summation extends over the whole set of functions of the form
(8). Use of the general rule for determining Fourier coefficients (II, §3)
gives


    $c_i = (V, U_i)$    $(i = 1, 2, \ldots).$

The function $U_i$, like $V$, is, by Theorem 1, an eigenfunction of the
operator $\mathcal{H}$. If the eigenvalue $\varepsilon_{r_1} + \varepsilon_{r_2} + \cdots + \varepsilon_{r_N}$, corresponding to $U_i$,
is not equal to $E$, i.e., if it does not satisfy (9), then $V$ and $U_i$ belong to
different eigenvalues of this operator and are thus mutually orthogonal
(II, §3), so that $c_i = 0$. Hence, the function $V$ is represented as a linear
combination of functions of the form (8), in which the numbers $r_i$ are
related by (9). This is precisely what we were required to show.

Therefore, the set of functions of the form (8), for which $\varepsilon_{r_1} + \varepsilon_{r_2} +
\cdots + \varepsilon_{r_N} = E$, is a linear orthonormal basis of the manifold $\mathfrak{M}_E$. We shall
henceforth always call these functions the fundamental eigenfunctions, and 
the states described by them the fundamental states of the system. The set 
of fundamental states of a system is uniquely defined by the choice of the 
original orthogonal set (7) and by the assignment of the number of particles
$N$ and the energy $E$ of the system. In particular, if the structure of
the constituent particles is known, then the number $m$ of fundamental
states of the system is a function of $N$ and $E$, which we shall denote by
$\Omega(N, E)$. We call it the structure function of the system. This function, as
we shall see, plays a leading role in our theory. Denoting the fundamental
eigenfunctions corresponding to the energy level $E$ by $U_1, U_2, \ldots, U_m$,
we have for any physical quantity $\mathfrak{A}$, with operator $\mathcal{A}$, in the case of
complete statistics,


    $\langle \mathfrak{A} \rangle = [\Omega(N, E)]^{-1} \sum_{i=1}^{m} (\mathcal{A}U_i, U_i).$


(This follows from §2.) 

We shall presently prove that the above linear basis, composed of fun- 
damental eigenfunctions, is especially convenient for the subsequent de- 
velopment of the theory. However, we return now to the consideration of 
symmetric statistics. 

In the expression (8) for a fundamental function $U$ the indices of the
variables $q_1, q_2, \ldots, q_N$ comprise the sequence of integers from 1 to $N$.
We generate an arbitrary permutation $\mathcal{P}$ among them, i.e., we transform
the function $U$ to the function


    $\mathcal{P}U = u_{r_1}(q_{i_1})\, u_{r_2}(q_{i_2}) \cdots u_{r_N}(q_{i_N}),$


where $i_1, i_2, \ldots, i_N$ are the numbers of the sequence $1, 2, \ldots, N$, in an
order defined by the permutation $\mathcal{P}$. The number of possible permutations
is obviously equal to $N!$. We now define $S$ such that


    $S = S(q_1, q_2, \ldots, q_N) = \sum_{\mathcal{P}} \mathcal{P}U,$


where the summation extends over all permutations of the sequence $1, 2,
\ldots, N$ that give different functions $\mathcal{P}U$. It is obvious that $S$ is a symmetric
function of the variables $q_1, q_2, \ldots, q_N$ and at the same time an
eigenfunction of the operator $\mathcal{H}$ corresponding to the eigenvalue $E$.

This construction, which we have just carried out for one fundamental 
function U, can be carried out for each of the fundamental functions of 
our fundamental linear basis. From each construction we obtain a certain 
symmetric function S. Clearly, some of these functions can be identical. 
We denote by $S_1, S_2, \ldots, S_s$ the set of distinct eigenfunctions obtained
in this way, and by $\mathfrak{S}_E$ the set of all symmetric functions of the manifold
$\mathfrak{M}_E$, i.e., all symmetric eigenfunctions of the operator $\mathcal{H}$ corresponding
to the eigenvalue $E$.

THEOREM 3. The functions $S_1, S_2, \ldots, S_s$ constitute an orthogonal linear
basis of the manifold $\mathfrak{S}_E$.

Proof. 1. Suppose $m \neq n$, and


(S)    $S_m = \sum_{\mathcal{P}} \mathcal{P}U_m, \qquad S_n = \sum_{\mathcal{P}'} \mathcal{P}'U_n.$

Then

    $(S_m, S_n) = \sum_{\mathcal{P}, \mathcal{P}'} (\mathcal{P}U_m, \mathcal{P}'U_n),$


where the summation extends over all pairs $\mathcal{P}$, $\mathcal{P}'$ corresponding to
permutations of the sequence $1, 2, \ldots, N$ which occur in (S). If any of the scalar
products $(\mathcal{P}U_m, \mathcal{P}'U_n)$ are different from zero, then we must have $\mathcal{P}U_m =
\mathcal{P}'U_n$, because $\mathcal{P}U_m$ and $\mathcal{P}'U_n$ are fundamental functions, i.e., members
of the orthogonal system of functions (8). From $\mathcal{P}U_m = \mathcal{P}'U_n$ it follows
that


    $U_m = \mathcal{P}^{-1}\mathcal{P}'U_n = \mathcal{Q}U_n,$


where $\mathcal{Q} = \mathcal{P}^{-1}\mathcal{P}'$ is a permutation of the sequence $1, 2, \ldots, N$. Thus, the
function $U_m$ is obtained from a certain permutation $\mathcal{Q}$ on the function $U_n$.
This leads to the contradiction $S_m = S_n$, from which it follows that
$(S_m, S_n) = 0$ for $m \neq n$, i.e., the functions $S_1, S_2, \ldots, S_s$ are pairwise
orthogonal.

2. Let $S$ be any function of the manifold $\mathfrak{S}_E$ (and hence also a function
of the manifold $\mathfrak{M}_E$). By Theorem 2,


(11)    $S = \sum_k \gamma_k U_k,$


where the $\gamma_k$ are complex numbers, and the $U_k$ comprise the fundamental
linear basis of the manifold $\mathfrak{M}_E$. Let some pair of functions $U_k$ and $U_l$ be
related by $U_l = \mathcal{P}U_k$, where $\mathcal{P}$ is a permutation of the sequence $1, 2, \ldots,
N$. Then


    $\gamma_l = (S, U_l) = (S, \mathcal{P}U_k).$

By the symmetry of the function $S$, we must have $\mathcal{P}S = S$, and hence

    $\gamma_l = (\mathcal{P}S, \mathcal{P}U_k).$


If one represents scalar products in the form of integrals, then $(\mathcal{P}S, \mathcal{P}U_k)$
obviously differs from $(S, U_k)$ only in the notation of variables in the
integral, so that


    $\gamma_l = (S, U_k) = \gamma_k.$


Thus, in the sum (11), all functions $U_l$ of the form $\mathcal{P}U_k$ have the same
coefficient, namely $\gamma_k$. The sum of the corresponding group of terms on
the right side of (11) is therefore


    $\gamma_k \sum_{\mathcal{P}} \mathcal{P}U_k = \gamma_k S_i,$


where $S_i$ is one of the functions $S_1, S_2, \ldots, S_s$. Since the right side of
(11) is composed entirely of such groups,


    $S = \sum_{i=1}^{s} \lambda_i S_i,$

where the $\lambda_i$ are complex numbers. Hence, the functions $S_i$ constitute a
linear basis of the manifold $\mathfrak{S}_E$. This proves Theorem 3.

When the particles of a system obey symmetric statistics we shall call
the functions $S_1, S_2, \ldots, S_s$ (after normalization through multiplication
by the appropriate normalizing factor) the fundamental eigenfunctions,
and the states described by them the fundamental states of the system
for the given energy level $E$. The number $s$ of these fundamental states
we shall denote, as before, by $\Omega(N, E)$ and call it the structure function
of the system. The microcanonical average of a physical quantity $\mathfrak{A}$, when
the averaging is carried out only over the symmetric (normalized)
eigenfunctions, is expressed by the formula


    $\langle \mathfrak{A} \rangle = [\Omega(N, E)]^{-1} \sum_{i=1}^{s} (\mathcal{A}S_i, S_i).$


We now turn to the case of antisymmetric statistics. For every function 
U of the form (8) we form the function 


    $A = A(q_1, q_2, \ldots, q_N) = \sum_{\mathcal{P}} \pm \mathcal{P}U,$


where the summation extends over all permutations of the sequence $1, 2,
\ldots, N$, and the sign $+$ or $-$ is used depending on the evenness or oddness
of the permutation $\mathcal{P}$. It is easy to see that the function $A$ can be rewritten
in the form of a determinant


(12)    $A = \begin{vmatrix} u_{r_1}(q_1) & u_{r_1}(q_2) & \cdots & u_{r_1}(q_N) \\ u_{r_2}(q_1) & u_{r_2}(q_2) & \cdots & u_{r_2}(q_N) \\ \vdots & \vdots & & \vdots \\ u_{r_N}(q_1) & u_{r_N}(q_2) & \cdots & u_{r_N}(q_N) \end{vmatrix}.$


However, we shall hardly ever need this expression. We carry out the
construction indicated above for each of the functions $U$ of the form (8) and
denote (after normalization) by $A_1, A_2, \ldots, A_a$ the set of (antisymmetric)
functions obtained in this way. The members of this set differ from one
another and do not vanish. Let $\mathfrak{A}_E$ be the manifold of all antisymmetric
functions of the manifold $\mathfrak{M}_E$. Then the following theorem is valid.

THEOREM 4. The functions $A_1, A_2, \ldots, A_a$ constitute an orthogonal linear
basis of the manifold $\mathfrak{A}_E$.

It is not necessary to prove this theorem because the proof is analogous 
to that of Theorem 3. When the system consists of particles that obey 
antisymmetric statistics, we shall call the (normalized) functions $A_1, A_2,
\ldots, A_a$ the fundamental eigenfunctions, and the states described by them
the fundamental states of the system for the given energy level $E$. The
number $a$ of these fundamental states we again denote by $\Omega(N, E)$ and call
it the structure function of the system. For the microcanonical average of
a quantity $\mathfrak{A}$ in this case, we average over only the antisymmetric
eigenfunctions. Hence,


    $\langle \mathfrak{A} \rangle = [\Omega(N, E)]^{-1} \sum_{j=1}^{a} (\mathcal{A}A_j, A_j).$


The fundamental states of a system which are described by the
antisymmetric eigenfunctions $A_j$ possess one extremely important
characteristic: If the system is in such a state, then the states of any two of its
particles must differ from one another. In fact, if in the eigenfunction of such
a state two of the indices $r_i$ are equal, then the determinant (12) will have
two identical rows, and will therefore vanish. Thus,


    $A(q_1, q_2, \ldots, q_N) = 0.$


This type of function cannot be normalized and does not describe any real 
state of the system. The above rule is usually called the Pauli principle. 
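
The vanishing of the antisymmetrized sum for repeated indices can be seen on a small example. The sketch below assumes hypothetical single-particle functions u_r(q) = q**r, chosen only because they give exact integer values; they are not the eigenfunctions of any operator discussed in the text.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation of 0..N-1, -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def u(r, q):
    """Hypothetical single-particle function u_r(q) = q**r (illustration only)."""
    return q ** r

def antisymmetrize(indices, coords):
    """Value at the point coords of the signed sum over all permutations P of
    the product u_{r_1}(q_{P(1)}) ... u_{r_N}(q_{P(N)})."""
    total = 0
    for perm in permutations(range(len(indices))):
        term = 1
        for k, r in enumerate(indices):
            term *= u(r, coords[perm[k]])
        total += sign(perm) * term
    return total

q = (2, 3, 5)
print(antisymmetrize((0, 1, 2), q))   # all indices r_i different
print(antisymmetrize((1, 1, 2), q))   # two equal indices r_i
print(antisymmetrize((0, 1, 2), (3, 2, 5)))   # two coordinates interchanged
```

With distinct indices the sum is the determinant of the matrix with entries u_{r_k}(q_j) and changes sign when two coordinates are interchanged; with two equal indices every term has a cancelling partner.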

In the case of complete statistics, if the system is in a fundamental state, 
then the state of each of the particles composing the system is known. 
[The state of the $i$th particle is determined by the index $r_i$ in the
expression (8) for the fundamental state $U$.] In the case of symmetric or
antisymmetric statistics such a statement would not be meaningful. For
example, let


    $S = \sum_{\mathcal{P}} \mathcal{P}U$


be one of the fundamental symmetric eigenfunctions. In the different terms
$\mathcal{P}U$ of the state $S$, the first particle, for instance, is described by the
different functions $u_{r_i}(q_1)$, so that in the state $S$ the first particle is not in
any definite state. In the case of symmetric or antisymmetric statistics 
the only sensible question is how many particles are to be found in some 
particular state if we know the state of the overall system. We discuss 
this question in detail in the following section. 


§5. Occupation numbers. Basic expressions of the structure functions 


Suppose we are given a stationary state $U$ of a system which obeys one
of the three statistics. $U$, as usual, is an eigenfunction of the energy
operator $\mathcal{H}$, corresponding to a definite eigenvalue (energy level) $E$ of this
operator. We can ask how many particles will be found in a state which is
described by an eigenfunction $u_r(x)$ (corresponding to the energy level $\varepsilon_r$ of
a particle). The number $a_r = a_r(U)$ of such particles is, generally, a
physical quantity whose value is not uniquely determined by the state $U$. The
only characteristic determined by the state $U$ is the distribution law of
the number $a_r$ and, in particular, its mathematical expectation $E_U a_r$. But
in every fundamental state of the system the value of the number $a_r$ is uniquely
determined. So, if we are concerned with complete statistics, then each of
the fundamental eigenfunctions, according to formula (8) of §4, has the
form


(8)    $U = u_{r_1}(q_1)\, u_{r_2}(q_2) \cdots u_{r_N}(q_N).$


This shows directly which of its possible states each particle occupies. The
number $a_r$ of particles in the state $u_r(x)$ is simply the number of indices
$r_i$ which equal $r$ in expression (8). Clearly, this number has a definite value
for each fundamental state. We are easily led to the same result in the
case of the other two statistics. In these latter cases the fundamental
functions are linear combinations of functions of the form (8), where in all the
terms the number of indices $r_i$ equal to any given number is the same,
and hence the number $a_r$ is uniquely determined.

If the state U is not fundamental, then it can be represented in the form 


    $U = \sum_i \alpha_i U_i,$


where the $\alpha_i$ are complex numbers. (For the purposes of normalization
we shall suppose that $\sum_i |\alpha_i|^2 = 1$.) The $U_i$ are the
fundamental states and among them there will be fundamental states with
different values of the number $a_r$, so that it is impossible to speak of any
definite value of this number in the state $U$. One can only assert that if
$a_r^{(i)}$ is the value of the number $a_r$ in the fundamental state $U_i$, then $\sum_j |\alpha_j|^2$
is the probability that $a_r$ will equal $a_r^{(i)}$ in the state $U$, where the
summation extends over all $j$ for which $a_r^{(j)} = a_r^{(i)}$. Thus, in particular,


    $E_U a_r = \sum_i |\alpha_i|^2 a_r^{(i)}.$


(We do not prove this statement as it will not be of further interest to us.) 


In statistical physics the numbers $a_r$ $(r = 1, 2, \ldots)$, which represent
the numbers of particles in the states described by eigenfunctions
$u_r(x)$ $(r = 1, 2, \ldots)$, are usually called "occupation numbers". It is
obvious that if a system is composed of $N$ particles and has an energy $E$,
then, independently of the kind of statistics obeyed by the system, we must
have


(K)    $\sum_{r=1}^{\infty} a_r = N, \qquad \sum_{r=1}^{\infty} a_r \varepsilon_r = E.$


The occupation numbers play a very important part in any description 
of statistical physics. The fact that for fundamental states these numbers 
take on uniquely determined values, makes the systems of fundamental 
eigenfunctions especially convenient linear bases of the manifolds over 
which averages must extend. (The particular manifold used, of course, 
depends on the type of statistics obeyed by the system.) 
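
As a small illustration, the occupation numbers of a product state of the form (8) can be read off from its index tuple, and relation (K) checked directly. The spectrum eps_r = r and the function names below are assumptions made for the example only.

```python
from collections import Counter

def occupation_numbers(indices):
    """Occupation numbers a_r of a product state with indices (r_1, ..., r_N):
    a_r is the number of indices r_i equal to r."""
    return Counter(indices)

def satisfies_K(occ, energy_of, N, E):
    """Check both equations (K): sum of a_r equals N, sum of a_r * eps_r equals E."""
    return (sum(occ.values()) == N
            and sum(a * energy_of(r) for r, a in occ.items()) == E)

def energy_of(r):
    """Hypothetical single-particle spectrum eps_r = r."""
    return r

indices = (1, 3, 3, 2, 1)            # r_1, ..., r_5 of a state of the form (8)
occ = occupation_numbers(indices)
print(dict(occ))
print(satisfies_K(occ, energy_of, N=5, E=10))
```

Any permutation of the index tuple gives the same occupation numbers, which is why they remain well defined for the symmetrized and antisymmetrized combinations.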

We have seen that in each of the three statistics, a definite set of
occupation numbers $a_1, a_2, \ldots, a_r, \ldots$, which satisfy relation (K),
corresponds to each fundamental state of the system. Now we pose the converse
problem: Suppose we are given a set of non-negative integers $a_r$ $(r = 1, 2,
\ldots)$, satisfying relation (K); then do fundamental states of the system
exist which have the numbers $a_r$ for occupation numbers, and, if so, how
many of these fundamental states are there?

This problem, which is very important for the development of the sta- 
tistical theory, is solved in different ways for the three statistical schemes. 

1. Complete Statistics. Let $a_1, a_2, \ldots, a_r, \ldots$ be non-negative integers,
related by (K). In order that a fundamental state (8) have the numbers
$a_r$ for its occupation numbers, it is necessary and sufficient that the
function $u_r(x)$ occur a total of $a_r$ times among the functions which are the
factors of $U$. Since $\sum_{r=1}^{\infty} a_r = N$, it is obviously always possible to realize
this requirement. The number of ways in which this can be accomplished
is equal to the number of permutations of the set of $N$ elements composed
of subsets of $a_1, a_2, \ldots$ identical elements, i.e., it is equal to


    $N! / (a_1!\, a_2! \cdots a_r! \cdots).$


Thus, in the case of complete statistics, to each solution (in terms of
non-negative integers $a_r$) of the system of equations (K), there correspond


    $N! \Big/ \prod_{r=1}^{\infty} a_r!$


fundamental states having the numbers $a_r$ for their occupation numbers.
2. Symmetric Statistics. In the expression


    $S = \sum_{\mathcal{P}} \mathcal{P}U$


of the fundamental states for the case of symmetric statistics the terms
of the right side differ from one another only by the order of the indices
$i_k$ in the product


(13)    $u_{r_1}(q_{i_1})\, u_{r_2}(q_{i_2}) \cdots u_{r_N}(q_{i_N}).$


The indices $r_k$ are the same in all the terms, and hence the occupation
numbers are also the same for all the terms. Clearly the converse also holds:
All products of the form (13), which have a definite set of occupation
numbers, can differ from one another only in the order of the indices $i_k$,
and are obtained from one another by the corresponding permutation of
these indices. This means that all such products constitute a single
symmetric fundamental function $S$, which is thus uniquely determined by the
given set of occupation numbers: In the case of symmetric statistics to each
solution (in terms of non-negative integers $a_r$) of the system of equations
(K), there corresponds one fundamental state, having the numbers $a_r$ for
its occupation numbers.

3. Antisymmetric Statistics. In this case for the fundamental functions,
we have the expression

    $A = \sum_{\mathcal{P}} \pm \mathcal{P}U,$


where the $+$ sign is to be used for even, and the $-$ sign for odd
permutations $\mathcal{P}$. Here too, the terms of the right side comprise the whole set of
functions of the form (13) which have the same set of occupation numbers.
Hence each solution of the system (K) cannot correspond to more than
one antisymmetric state. However, we do not get a normalizable
eigenfunction $A$ for every set of occupation numbers; in some cases it turns
out to vanish identically. Consider an arbitrary set of occupation numbers
$a_r$ $(r = 1, 2, \ldots)$. If each $a_r$ is either 0 or 1, then in the product (13) all
of the indices $r_k$ are different, and so two such products, obtained from one
another by some permutation of the indices $i_k$, cannot be the same. Thus
in the sum $\sum_{\mathcal{P}} \pm \mathcal{P}U$ it is impossible for identical terms to enter, and $A$
will be a normalizable eigenfunction. If, among the numbers $a_r$, there should
be one greater than unity, then among the indices $r_k$ in the product (13)
some will be equal. Because the order of enumeration is immaterial, let
us suppose that $r_1 = r_2$. Then the product (13) actually coincides with
the product


    $u_{r_1}(q_{i_2})\, u_{r_2}(q_{i_1}) \cdots u_{r_N}(q_{i_N}),$


which is obtained from it by permutation of the indices $i_1$ and $i_2$. Because
each of these two products is obtained from the other by a simple
transposition, they will have opposite signs in the sum which constitutes $A$, and
will cancel one another. Since such a cancelling partner can be found for
every member of this sum, $A = 0$.
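
The three counting rules can be reproduced by brute force for a small set of occupation numbers. The sketch below is an illustration only: a product of the form (13) is represented by the tuple that records which index r is attached to each coordinate, and the function names are invented for the example.

```python
from collections import defaultdict
from itertools import permutations
from math import factorial

def sign(perm):
    """+1 for an even permutation of 0..N-1, -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def signed_arrangements(occ):
    """Map each distinct product of the form (13) with occupation numbers
    occ = {r: a_r} to its total coefficient in the signed sum over all N!
    permutations."""
    indices = [r for r, a in sorted(occ.items()) for _ in range(a)]
    n = len(indices)
    coeff = defaultdict(int)
    for perm in permutations(range(n)):
        arranged = [None] * n
        for k, r in enumerate(indices):
            arranged[perm[k]] = r        # factor u_r evaluated at q_{perm[k]}
        coeff[tuple(arranged)] += sign(perm)
    return coeff

def count_states(occ, statistics):
    """Number of fundamental states with the given occupation numbers."""
    coeff = signed_arrangements(occ)
    if statistics == "complete":
        return len(coeff)                # every distinct product is a state
    if statistics == "symmetric":
        return 1                         # all products are summed into one S
    if statistics == "antisymmetric":
        return 1 if any(c != 0 for c in coeff.values()) else 0
    raise ValueError(statistics)

def multinomial(occ):
    """N! divided by the product of the a_r!."""
    result = factorial(sum(occ.values()))
    for a in occ.values():
        result //= factorial(a)
    return result

for occ in ({1: 2, 2: 1}, {1: 1, 2: 1, 3: 1}):
    print(occ, [count_states(occ, s)
                for s in ("complete", "symmetric", "antisymmetric")])
```

For repeated indices every distinct product receives total coefficient zero, which is the cancellation described above; for distinct indices each product occurs exactly once.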

Thus, in the case of antisymmetric statistics, to each solution (in terms of
non-negative integers $a_r$) of the system of equations (K), there corresponds:
1) one fundamental state, having the numbers $a_r$ for its occupation numbers,
if all $a_r \leq 1$, or 2) no such states if there is an $a_r > 1$. It will be convenient
for us to have an expression for the number of fundamental states
corresponding to a given set of occupation numbers which will apply to all three
statistical schemes. For this purpose we put, for $n = 1, 2, \ldots$,


    $C(n) = \begin{cases} n! & \text{for complete statistics,} \\ 1 & \text{for symmetric or antisymmetric statistics;} \end{cases}$

and

    $\gamma(n) = \begin{cases} 1/n! & \text{for complete statistics,} \\ 1 & \text{for symmetric statistics,} \\ 1 \text{ if } n \leq 1, \ 0 \text{ if } n > 1 & \text{for antisymmetric statistics.} \end{cases}$


Then, the number of fundamental states corresponding to a given set
of numbers $a_r \geq 0$, satisfying the system of equations (K), is

    $C(N) \prod_{r=1}^{\infty} \gamma(a_r),$


for any of the three statistics. We obtain from this, for the total number
of fundamental states of the system corresponding to a given energy level
$E$, the expression


(14)    $\Omega(N, E) = C(N) \sum \prod_{r=1}^{\infty} \gamma(a_r),$

where the summation on the right side extends over all systems of integers
$a_1, a_2, \ldots, a_r, \ldots$ $(a_r \geq 0)$, satisfying equation (K).


As we have already remarked, the structure function $\Omega(N, E)$ plays a
very important part in our subsequent development. Hence we shall have
to study it with considerable care. Formula (14), which relates the
function $\Omega(N, E)$ to the solutions of the system of equations (K), will serve
as a starting point for our investigation.
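
Formula (14) can be compared with a direct enumeration of fundamental states on a toy model. The sketch below rests on assumptions made only for illustration: a finite, non-degenerate spectrum eps_r = r with r = 1, ..., 4, and the convention that the factor for an empty level is 1 (since 0! = 1). The direct count takes the fundamental states to be ordered index tuples (complete statistics), multisets of indices (symmetric) or sets of distinct indices (antisymmetric).

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement, product
from math import factorial

LEVELS = {1: 1, 2: 2, 3: 3, 4: 4}    # hypothetical spectrum: index r -> eps_r

def C(n, statistics):
    return factorial(n) if statistics == "complete" else 1

def gamma(n, statistics):
    # gamma(0) = 1 in all three cases is an assumption of this sketch.
    if statistics == "complete":
        return Fraction(1, factorial(n))
    if statistics == "symmetric":
        return Fraction(1)
    return Fraction(1 if n <= 1 else 0)          # antisymmetric

def occupation_sets(N, E):
    """All tuples (a_1, ..., a_R) of non-negative integers satisfying (K)."""
    rs = sorted(LEVELS)
    for occ in product(range(N + 1), repeat=len(rs)):
        if sum(occ) == N and sum(a * LEVELS[r] for a, r in zip(occ, rs)) == E:
            yield occ

def omega_formula(N, E, statistics):
    """Structure function computed from formula (14)."""
    total = Fraction(0)
    for occ in occupation_sets(N, E):
        term = Fraction(1)
        for a in occ:
            term *= gamma(a, statistics)
        total += term
    return int(C(N, statistics) * total)

def omega_direct(N, E, statistics):
    """Structure function computed by listing the fundamental states."""
    rs = sorted(LEVELS)
    if statistics == "complete":
        states = product(rs, repeat=N)
    elif statistics == "symmetric":
        states = combinations_with_replacement(rs, N)
    else:
        states = combinations(rs, N)
    return sum(1 for s in states if sum(LEVELS[r] for r in s) == E)

for stat in ("complete", "symmetric", "antisymmetric"):
    print(stat, omega_formula(3, 6, stat), omega_direct(3, 6, stat))
```

Exact rational arithmetic is used so that the factors 1/n! do not introduce rounding.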

For any statistics the occupation numbers $a_r$ are uniquely determined
for each fundamental state and can be considered as functions of this state.
Hence, for any energy level $E$ they have definite (microcanonical) mean
values $\langle a_r \rangle$. We shall see further on that these mean values of the
occupation numbers are of fundamental importance for all the computational
formulas of statistical physics. The determination of their asymptotic
expressions therefore constitutes one of the basic mathematical problems
of the theory and the following two chapters will be devoted mainly to this
problem. However, we pause here to discuss certain general considerations
which, to a large extent, clarify the importance of the numbers $\langle a_r \rangle$
in the asymptotic computations of statistical physics.

A large fraction of the most important physical quantities $\mathfrak{A}$, which
characterize the state of a system composed of $N$ identical particles, have
a certain special form: Such a quantity $\mathfrak{A}$ is a sum of $N$ quantities $\mathfrak{A}_1, \mathfrak{A}_2,
\ldots, \mathfrak{A}_N$, each of which depends (in the very same way) only on the state
of the particle with the corresponding number. Therefore, in the simplest
case, when the state of the system is described by the function


    $U = u_{r_1}(q_1)\, u_{r_2}(q_2) \cdots u_{r_N}(q_N),$


the quantity $\mathfrak{A}_k$ depends only on the number $r_k$, i.e., on the index of the
eigenfunction $u_{r_k}(q_k)$ which describes the state of the $k$th particle. To the
decomposition


    $\mathfrak{A} = \mathfrak{A}_1 + \mathfrak{A}_2 + \cdots + \mathfrak{A}_N$

of the quantity $\mathfrak{A}$ there corresponds, of course, the decomposition

    $\mathcal{A} = \mathcal{A}_1 + \mathcal{A}_2 + \cdots + \mathcal{A}_N$


of the corresponding operator. In this latter decomposition the operator
$\mathcal{A}_k$ $(k = 1, 2, \ldots, N)$ acts only on the eigenfunction of the $k$th particle,
so that


    $\mathcal{A}_k U = u_{r_1}(q_1) \cdots u_{r_{k-1}}(q_{k-1})\, [\mathcal{A}_k u_{r_k}(q_k)]\, u_{r_{k+1}}(q_{k+1}) \cdots u_{r_N}(q_N).$


Quantities $\mathfrak{A}$ of the above type will be called sum functions. We now
consider how to construct the mean values of these sum functions.

If, as we have assumed, the dependence of the quantity $\mathfrak{A}_k$ on the state
of the $k$th particle is the same for all $k$, then, in particular, the mathematical
expectation of the quantity $\mathfrak{A}_k$ in some state $u_r(q_k)$ of the $k$th particle does
not depend on $k$, but only on $r$. We denote this mathematical expectation
by $\lambda_r$. Thus the set of numbers $\lambda_r$ depends only on the structure of the
particles and on the choice of the elementary quantity $\mathfrak{A}_k$. These numbers
are not related to the system and retain their meaning and their values
even when we consider an individual particle instead of a system.

Now let our system be in a fundamental state $U$ with occupation
numbers $a_1, a_2, \ldots, a_r, \ldots$. Then, first of all,


(15)    $E_U \mathfrak{A} = \sum_{k=1}^{N} E_U \mathfrak{A}_k.$


It is obvious that $E_U \mathfrak{A}_k = \lambda_r$, if the $k$th particle is in the state $u_r(q_k)$.
Therefore, in the sum (15) the term $\lambda_r$ appears as many times as there are
particles in state $u_r(q_k)$, i.e., $a_r$ times. Whence,


    $E_U \mathfrak{A} = \sum_{r=1}^{\infty} \lambda_r a_r,$

and consequently,

(16)    $\langle \mathfrak{A} \rangle = \langle E_U \mathfrak{A} \rangle = \sum_{r=1}^{\infty} \lambda_r \langle a_r \rangle.$


This simple relation shows that a knowledge of the mean values of the
occupation numbers allows us to write down immediately the mean value
of any sum function $\mathfrak{A}$. (It is necessary to recall that the numbers $\lambda_r$ do
not depend on the state of the system and they can be calculated once and
for all for particles of a given structure and for a given choice of the
quantity $\mathfrak{A}_k$.) This use of the occupation numbers is the chief reason that the
first problem considered in every statistical theory is that of finding
convenient asymptotic expressions for their mean values.
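
Relation (16) can be verified on a toy system by averaging over the fundamental states in two ways. In the sketch below the spectrum, the values lambda_r and the restriction to complete statistics are all assumptions made for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

LEVELS = {1: 1, 2: 2, 3: 3, 4: 4}      # hypothetical spectrum eps_r
LAMBDA = {1: 5, 2: -1, 3: 2, 4: 7}     # hypothetical expectations lambda_r

def fundamental_states(N, E):
    """Fundamental states for complete statistics: index tuples of energy E."""
    return [s for s in product(sorted(LEVELS), repeat=N)
            if sum(LEVELS[r] for r in s) == E]

def mean_occupations(states):
    """Microcanonical means of the a_r: equal weight for each fundamental state."""
    totals = Counter()
    for s in states:
        totals.update(s)
    return {r: Fraction(totals[r], len(states)) for r in LEVELS}

def mean_direct(states):
    """Average over the states of the sum over particles of lambda_{r_k}."""
    return Fraction(sum(sum(LAMBDA[r] for r in s) for s in states), len(states))

def mean_via_16(states):
    """The same mean obtained from formula (16)."""
    occ = mean_occupations(states)
    return sum(LAMBDA[r] * occ[r] for r in LEVELS)

states = fundamental_states(3, 6)
print(mean_direct(states), mean_via_16(states))
```

Only the mean occupation numbers enter the second computation, which is the practical content of (16).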


§6. On the suitability of microcanonical averages 


In §1 we spoke in some detail about the fact that physical quantities, 
which characterize the state of a system from the phenomenological point 
of view, must, in the statistical theory, depend symmetrically on all of the 
particles composing the system. The mean value of such a quantity, to 
really exist, i.e., to really have a physical meaning in a given experiment, 
must have a value close to that one of the values given by the statistical 
theory which is appropriate for the particular experiment. We pointed out
in detail in §1 that for this purpose the mean values given by our theory
must be suitable; that is, in the overwhelming majority of those states over 
which averages are extended a quantity must assume values extremely 
close to its mean value. 

For a given total energy $E$, we have defined the mean value of the
quantity $\mathfrak{A}$ by


(17)    $\langle \mathfrak{A} \rangle = \langle E_U \mathfrak{A} \rangle = [\Omega(N, E)]^{-1} \sum_{k=1}^{m} E_{U_k} \mathfrak{A},$


where $m = \Omega(N, E)$ is the number of fundamental states $U_k$ of the system
for the energy level $E$, and where the summation extends over all such
states. The system, of course, can obey any of the three types of statistics
we have considered.

The simplest and most important type of quantity $\mathfrak{A}$ which depends
symmetrically on all the particles composing the system is obviously the
sum function considered at the end of the preceding section. This function
is a sum of $N$ quantities, each depending on the state of a single particle
and all having the same functional dependence. In the present section we
shall always write


    $\mathfrak{A} = \sum_{i=1}^{N} \mathfrak{A}_i$


for such a sum function. Because the dependence of the quantity $\mathfrak{A}_i$ on the
state of the $i$th particle is the same for all $i$, the mean value $\langle \mathfrak{A}_i \rangle = a$
does not depend on $i$. Obviously we have


    $\langle \mathfrak{A} \rangle = Na,$


so that, as long as $a \neq 0$, we can always think of $\langle \mathfrak{A} \rangle$ as a large number
(of the order of $N$). We can therefore think of the value of the quantity
$\mathfrak{A}$ as near $\langle \mathfrak{A} \rangle$, if


(18)    $|\mathfrak{A} - Na| < \varepsilon N,$


where $\varepsilon$ is a small positive number; because if (18) holds, the relative error
in the equation $\mathfrak{A} \approx \langle \mathfrak{A} \rangle$ is smaller than $|\varepsilon / a|$.

We recall that if the given system is in a fundamental state, we cannot 
say whether inequality (18) is satisfied or not, because in general the
quantity $\mathfrak{A}$ is not uniquely determined in the state $U$. If the system is in state
$U$, all that is determined is the probability


    $P_U\{ |\mathfrak{A} - Na| < \varepsilon N \}$


that inequality (18) is satisfied.

Now let $M_\delta$ denote the measure (in accordance with the definition of
measure given in §2) of the set of all eigenfunctions $U$ of the manifold
$\mathfrak{M}_E$ for which


    $P_U\{ |\mathfrak{A} - Na| < \varepsilon N \} < 1 - \delta;$

or, equivalently,

(19)    $P_U\{ |\mathfrak{A} - Na| \geq \varepsilon N \} > \delta,$


where $\delta$ is any (small) positive number. Then assuming $U = \sum_{k=1}^{m} \alpha_k U_k$,
we have


    $\int_{S^*} P_U\{ |\mathfrak{A} - Na| \geq \varepsilon N \}\, dS^* \geq \delta M_\delta,$


where $S^*$ is the complex sphere (2) of §2. Hence,


    $M_\delta \leq (1/\delta) \int_{S^*} P_U\{ |\mathfrak{A} - Na| \geq \varepsilon N \}\, dS^*.$


But, by the Chebyshev inequality,

    $P_U\{ |\mathfrak{A} - Na| \geq \varepsilon N \} \leq E_U\{ (\mathfrak{A} - aN)^2 \} / \varepsilon^2 N^2;$


whence,


(20)    $M_\delta \leq (1/\delta \varepsilon^2 N^2) \int_{S^*} E_U\{ (\mathfrak{A} - aN)^2 \}\, dS^* = \langle (\mathfrak{A} - aN)^2 \rangle / \delta \varepsilon^2 N^2.$

$M_\delta$ denotes the measure of the set of those stationary states $U$ of the
manifold $\mathfrak{M}_E$ for which inequality (19) holds. For all other states we can expect,
with a probability exceeding $1 - \delta$, that inequality (18) will hold, i.e.,
that the approximate equality $\mathfrak{A} \approx \langle \mathfrak{A} \rangle$ will hold. Thus, if $M_\delta$ becomes
small for small $\varepsilon$ and $\delta$, then in a large majority of the states $U$ we can
expect, with overwhelming probability, that the quantity $\mathfrak{A}$ has a value
near $\langle \mathfrak{A} \rangle$. This is just what we call the suitability of the mean value
$\langle \mathfrak{A} \rangle$. As inequality (20) shows, this suitability will be guaranteed if,
for small $\varepsilon$ and $\delta$, the quantity

    $\langle (\mathfrak{A} - aN)^2 \rangle / \delta \varepsilon^2 N^2$

remains sufficiently small.

For brevity we denote the mean value $\langle (\mathfrak{A} - aN)^2 \rangle$ by $D(\mathfrak{A})$ and
call it the microcanonical dispersion of the quantity $\mathfrak{A}$. In the above
calculation we have nowhere assumed that $\mathfrak{A}$ was a sum function. Now we
determine the special form of the microcanonical dispersion which is
appropriate for sum functions. Obviously, we have


(21)    $D(\mathfrak{A}) = \langle (\mathfrak{A} - aN)^2 \rangle = \langle \mathfrak{A}^2 \rangle - a^2 N^2 = \Bigl\langle \Bigl( \sum_{i=1}^{N} \mathfrak{A}_i \Bigr)^2 \Bigr\rangle - a^2 N^2.$

But for any state $U$ of the system

(22)    $E_U(\mathfrak{A}^2) = \sum_{i=1}^{N} E_U(\mathfrak{A}_i^2) + 2 \sum_{i<j} E_U(\mathfrak{A}_i \mathfrak{A}_j).$

Let the fundamental state $U_k$ correspond to the set of occupation numbers
$a_r(U_k) = a_r$ $(r = 1, 2, \ldots)$. Then in the state $U_k$ the system contains
$\tfrac{1}{2} a_r (a_r - 1)$ pairs of particles with both members in the state $u_r(x)$ $(r =
1, 2, \ldots)$ and, for $r \neq s$, $a_r a_s$ pairs of particles with one member in the
state $u_r(x)$ and the other in the state $u_s(x)$.

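Now let $\lambda_r$ and $\mu_r$ denote, respectively, the mathematical expectations
of the quantities $\mathfrak{A}_i$ and $\mathfrak{A}_i^2$, when the $i$th particle is in the state $u_r(x)$
$(r = 1, 2, \ldots)$. These quantities do not depend on $i$ and are determined
by the structure of the particles and the choice of the quantities $\mathfrak{A}_i$.
Formula (22) gives


    $E_{U_k}(\mathfrak{A}^2) = \sum_{r=1}^{\infty} a_r \mu_r + 2 \sum_{r=1}^{\infty} \tfrac{1}{2} a_r (a_r - 1) \lambda_r^2 + 2 \sum_{r<s} a_r a_s \lambda_r \lambda_s,$


where it is understood that $a_r = a_r(U_k)$, $a_s = a_s(U_k)$. For the
microcanonical average of the quantity $\mathfrak{A}^2$ we therefore find, by (17),


(23)    $\langle \mathfrak{A}^2 \rangle = (1/m) \sum_{k=1}^{m} E_{U_k}(\mathfrak{A}^2)$

        $= \sum_{r=1}^{\infty} \langle a_r \rangle \mu_r + \sum_{r=1}^{\infty} \langle a_r^2 \rangle \lambda_r^2 - \sum_{r=1}^{\infty} \langle a_r \rangle \lambda_r^2 + 2 \sum_{r<s} \langle a_r a_s \rangle \lambda_r \lambda_s$

        $= \sum_{r=1}^{\infty} (\mu_r - \lambda_r^2) \langle a_r \rangle + \sum_{r=1}^{\infty} \sum_{s=1}^{\infty} \lambda_r \lambda_s \langle a_r a_s \rangle.$


(In all of the above we assume, of course, that all the series obtained are
absolutely convergent; for this, it suffices to assume that the numbers
$|\lambda_r|$ and $\mu_r$ do not increase too rapidly with increasing $r$.) Since, on the
other hand, by formula (16) of §5,


    $\langle \mathfrak{A} \rangle = Na = \sum_{r=1}^{\infty} \lambda_r \langle a_r \rangle,$

we have that

    $N^2 a^2 = \sum_{r=1}^{\infty} \sum_{s=1}^{\infty} \lambda_r \lambda_s \langle a_r \rangle \langle a_s \rangle;$

and formulas (23) and (21) give

(24)    $D(\mathfrak{A}) = \sum_{r=1}^{\infty} (\mu_r - \lambda_r^2) \langle a_r \rangle + \sum_{r=1}^{\infty} \sum_{s=1}^{\infty} \lambda_r \lambda_s [\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle].$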
This is the desired expression for the microcanonical dispersion of sum
functions. To estimate the quantity $D(\mathfrak{A})$, we must know the mean values
$\langle a_r \rangle$ of the occupation numbers and the mean values $\langle a_r a_s \rangle$ of their
pairwise products. We have seen that the suitability of the mean value
$\langle \mathfrak{A} \rangle$ is guaranteed if the quantity


    $D(\mathfrak{A}) / \delta \varepsilon^2 N^2$

is sufficiently small. This requires that $D(\mathfrak{A})/N^2$ be negligibly small for
small $\varepsilon$ and $\delta$. In Chapters IV and V we shall see that in actual physical
cases, for $N \to \infty$, the quantity $D(\mathfrak{A})$ is infinitely large of order $N$. From
this it follows that $D(\mathfrak{A})/N^2$ is of order $1/N$, i.e., that it actually becomes
negligibly small.
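
The expression for the dispersion in terms of the occupation numbers can be checked against a direct average of formula (22) over the fundamental states. The sketch below is an illustration under assumptions that are not in the text: complete statistics, a four-level spectrum eps_r = r, and hypothetical values of lambda_r and mu_r with mu_r not smaller than lambda_r squared; in a product state the expectation of a product of quantities belonging to two different particles is taken to be the product of their expectations.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

LEVELS = {1: 1, 2: 2, 3: 3, 4: 4}      # hypothetical spectrum eps_r
LAMBDA = {1: 5, 2: -1, 3: 2, 4: 7}     # hypothetical first moments lambda_r
MU = {1: 30, 2: 2, 3: 4, 4: 50}        # hypothetical second moments mu_r

def fundamental_states(N, E):
    """Fundamental states for complete statistics: index tuples of energy E."""
    return [s for s in product(sorted(LEVELS), repeat=N)
            if sum(LEVELS[r] for r in s) == E]

def dispersion_direct(states):
    """Mean of E_U(A^2), built from (22), minus the square of the mean of E_U(A)."""
    m = len(states)
    first = Fraction(0)
    second = Fraction(0)
    for s in states:
        first += sum(LAMBDA[r] for r in s)
        second += sum(MU[r] for r in s)
        second += 2 * sum(LAMBDA[s[i]] * LAMBDA[s[j]]
                          for i, j in combinations(range(len(s)), 2))
    return second / m - (first / m) ** 2

def dispersion_via_occupations(states):
    """Dispersion from the means of a_r and of the products a_r * a_s."""
    m = len(states)
    occs = [Counter(s) for s in states]
    mean_a = {r: Fraction(sum(o[r] for o in occs), m) for r in LEVELS}
    mean_aa = {(r, s): Fraction(sum(o[r] * o[s] for o in occs), m)
               for r in LEVELS for s in LEVELS}
    single = sum((MU[r] - LAMBDA[r] ** 2) * mean_a[r] for r in LEVELS)
    pairs = sum(LAMBDA[r] * LAMBDA[s] * (mean_aa[r, s] - mean_a[r] * mean_a[s])
                for r in LEVELS for s in LEVELS)
    return single + pairs

states = fundamental_states(3, 6)
print(dispersion_direct(states), dispersion_via_occupations(states))
```

The two numbers agree exactly because the second computation is an algebraic rearrangement of the first.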

We shall discuss in somewhat more detail the manner in which the suit- 
ability of mean values is connected with their experimental verification. 
Assume that the total energy of the system is E, but that no other infor- 
mation regarding its state is known. We then make a measurement of the 
quantity $\mathfrak{A}$ and want to know the probability


    $P\{ |\mathfrak{A} - aN| < \varepsilon N \}$


that the value obtained for this quantity will lie between $(a - \varepsilon)N$ and
$(a + \varepsilon)N$. However, our question is meaningless when phrased in this
manner, because the theory gives the probability distribution of the
quantity $\mathfrak{A}$ only if the state $U$, in which the system is found, is known. We know
only its energy level $E$, which can correspond to an infinite set of different
states $U$, and we have no way of knowing in which of these the system
will be found. In order that our problem have a definite meaning, we must
know the relative frequency with which the various states $U$ of the
manifold $\mathfrak{M}_E$ will occur, under the established conditions of the experiment,
i.e., we must know the probability law of the states U, appropriate to the 
manner of experimentation. We suppose, for simplicity, that this law has 
a definite probability density p(U). It is immediately clear that this prob- 
ability density p(U) may depend in an essential manner on the conditions 
of the experiment: for example, upon whether we measure the quantity $\mathfrak{A}$
a large number of times in one system with a definite interval of time 
between measurements, or in a large number of systems simultaneously. 
(The time interval in the former case must be sufficiently great so that 
the result of any measurement will not be influenced by the preceding one.) 
However, if the function p(U) is known, then by the formula for total 
probability, 


(25) $P\{\,|\mathfrak{A} - aN| > \varepsilon N\,\} = \int_{M_E} p(U)\, P_U\{\,|\mathfrak{A} - aN| > \varepsilon N\,\}\, d\omega,$


where $d\omega$ is the volume element of the manifold $M_E$ for the measure we
defined on this manifold in §2. Let $M_\delta$ denote, as before, the measure of
the set of those states $U$ for which

$P_U = P_U\{\,|\mathfrak{A} - aN| > \varepsilon N\,\} > \delta.$


Then, assuming that the function $p(U)$ is bounded on $M_E$, and denoting
by $R$ its upper bound, we conclude from (25) that

$P\{\,|\mathfrak{A} - aN| > \varepsilon N\,\} = \int_{P_U \le \delta} p(U)\, P_U\, d\omega + \int_{P_U > \delta} p(U)\, P_U\, d\omega$

$\le \delta \int_{M_E} p(U)\, d\omega + R \int_{P_U > \delta} d\omega = \delta + R M_\delta ;$


whence, by (20), 


(26) $P\{\,|\mathfrak{A} - aN| > \varepsilon N\,\} \le \delta + R\,D(\mathfrak{A})/(\delta \varepsilon^2 N^2).$

If, as is the case in the great majority of physical problems, the quantity
$D(\mathfrak{A})/N^2$ approaches zero as $N \to \infty$, then for sufficiently small $\delta$ and $\varepsilon$, and
sufficiently large $N$, the right side of (26) becomes arbitrarily small. This means
that for a sufficiently large number of particles, the approximate equality
$\mathfrak{A} \approx Na$ will agree with experiment with a relative error less than $|\varepsilon/a|$ with
a probability arbitrarily close to unity. 
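
As a worked illustration of (26), the free parameter $\delta$ can be chosen to balance the two terms on the right side. The display below is only a sketch under two assumptions introduced here and not made in the text: that the bound $R$ does not depend on $N$, and that $D(\mathfrak{A}) \le cN$ for some constant $c$ (in line with the remark that $D(\mathfrak{A})$ is of order $N$).

```latex
\delta + \frac{R\,D(\mathfrak{A})}{\delta\,\varepsilon^{2}N^{2}}
\;\ge\; \frac{2\sqrt{R\,D(\mathfrak{A})}}{\varepsilon N},
\qquad \text{with equality for}\quad
\delta = \frac{\sqrt{R\,D(\mathfrak{A})}}{\varepsilon N},
\qquad \text{so that}\qquad
P\{\,|\mathfrak{A}-aN| > \varepsilon N\,\}
\;\le\; \frac{2}{\varepsilon}\sqrt{\frac{R\,c}{N}} .
```

Under these assumptions the probability of a relative deviation larger than $|\varepsilon/a|$ decreases at least like $N^{-1/2}$ for fixed $\varepsilon$.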

This conclusion, which gives a very satisfactory answer to the question 
of the real significance of our averages, is possible only if the probability 
distribution of the stationary states U has a bounded density p(U). (In 
other respects this probability law can be arbitrary, and this constitutes 
the chief value of the results obtained.) By complicating the calculation 
somewhat, it could be shown that we arrive at the same result under the 
broader hypothesis that this law is absolutely continuous with respect to 
the measure introduced on the manifold $M_E$. However, if the probability
law is not absolutely continuous with respect to this measure, then all of 
our calculations lose their basis, and the averages obtained by our method 
can lead to results which diverge sharply from experiment. We investigated 
a striking example of such a situation in §3 in considering the transition 
from complete statistics to symmetric (or antisymmetric) statistics. In 
general, the manifold $G_E$ of symmetric functions of the manifold $M_E$ has
measure zero in the measure assigned by us (like a linear manifold of a
smaller number of dimensions). Thus, if we consider a system obeying
symmetric statistics, then the only states $U$ which are really possible are
those belonging to $G_E$, and hence


$\int_{G_E} p(U)\, d\omega = 1.$


This shows that the law p(U) is not absolutely continuous with respect to 
our original measure; and, to make our conclusions agree with experiment, 
we had to replace the first method of averaging by a new one. This con- 
stituted the transition from complete statistics to symmetric (or anti- 
symmetric) statistics. 


Chapter IV 
FOUNDATIONS OF THE STATISTICS OF PHOTONS 


§1. Distinctive characteristics of the statistics of photons 


In this chapter we shall study systems composed of photons (light 
quanta). Such systems possess certain distinctive properties which make 
them fundamentally different from systems composed of material par- 
ticles. Even though all the methods we develop here are also applicable to 
systems of other types, the statistics of photons differ appreciably from 
the statistics of material particles and demand a separate study. We begin 
with the statistics of photons because they are considerably simpler than 
the statistics of material particles. Hence, the fundamental ideas of our 
method will not be burdened with too heavy a formal apparatus, and will 
emerge more clearly and be more easily understood. Then, after firmly 
grasping these ideas in a relatively simple mathematical context, the reader 
will be able to follow them without difficulty in the more complicated and 
formal relationships of the problems of the statistics of material particles. 

The basic characteristic which distinguishes and simplifies the statistics 
of photons compared to the statistics of material particles is the fact that 
the number N of photons, comprising a system with a fixed total energy 
E, is not fixed, but changes from state to state. Thus, the set of all possible 
states of a system, for a given total energy $E$, is described by a set of
eigenfunctions (belonging to the eigenvalue $E$), not of just one operator $\mathcal{H}$, but
of the energy operators of all possible systems, composed of any number of 
photons. 

On the other hand, photons always obey symmetric statistics. Therefore, 
if $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_r, \cdots$ is the sequence of possible photon energy levels for
a given set of conditions (as usual, $0 < \varepsilon_1 \le \varepsilon_2 \le \cdots \le \varepsilon_r \le \cdots$, and
each level is repeated a number of times equal to its multiplicity), then
the number $\Omega(N, E)$ of linearly independent (symmetric) eigenfunctions
of the operator $\mathcal{H}$, for a given level $E$ and a given number of photons $N$,
is equal to (III, §5) the number of solutions in terms of integral $a_r \ge 0$
of the system of equations


(1) $\sum_{r=1}^{\infty} a_r = N, \qquad \sum_{r=1}^{\infty} a_r \varepsilon_r = E.$


If we construct a linear basis of $\Omega(N, E)$ eigenfunctions for every $N$ and
then combine all these bases together, we get a system of


$\Omega(E) = \sum_{N=1}^{\infty} \Omega(N, E)$


(symmetric) eigenfunctions (of different operators), belonging to the same
eigenvalue $E$. Each of these functions describes a possible state of our
system when the system has a total energy $E$. Conversely, every such state
is described by a linear combination of these functions. It is obvious that
the number $\Omega(E)$ is simply equal to the number of solutions of the
equation


$\sum_{r=1}^{\infty} a_r \varepsilon_r = E$


with integral $a_r \ge 0$, since the first of equations (1) does not apply when
$N$ is arbitrary. In analogy with III, §4, we shall call $\Omega(E)$ the structure
function of our system of photons. 
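
The counting definition of the structure function can be made concrete with a short program. The sketch below assumes, as the text does later, that the levels are positive integers; the function name `structure_function` and the sample spectrum are illustrative choices, not part of the text.

```python
def structure_function(levels, e_max):
    """Return [Omega(0), ..., Omega(e_max)] for positive integer levels.

    `levels` lists every single-photon level once per unit of multiplicity.
    Omega(E) counts the solutions of sum_r a_r * levels[r] == E in integers
    a_r >= 0; Omega(0) = 1 counts the state containing no photons.
    """
    omega = [0] * (e_max + 1)
    omega[0] = 1
    for eps in levels:
        # Admit one more level: after this pass omega[x] also counts the
        # solutions in which this level carries 1, 2, ... photons.
        for x in range(eps, e_max + 1):
            omega[x] += omega[x - eps]
    return omega


if __name__ == "__main__":
    # Hypothetical spectrum: level 1 doubly degenerate, level 2 simple.
    print(structure_function([1, 1, 2], 4))  # prints [1, 2, 4, 6, 9]
```

Two levels with the same energy are treated as distinct unknowns $a_r$, which is why a degenerate level must appear in `levels` as many times as its multiplicity.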

When the number $N$ of particles is fixed, we saw (III, §2) that the
microcanonical average of a phase function $f(U)$ would be the arithmetic average
of its values for the $\Omega(N, E)$ functions of our linear basis. In other words,
we agreed to use these $\Omega(N, E)$ functions with equal weights for averaging.

Now since the number $N$ of photons has no fixed value, we naturally
take as a basis for averaging, the system of $\Omega(E)$ functions just constructed.
For the microcanonical average of a phase function $f(U)$ (for a given
value $E$) we shall take the arithmetic average of its values for these $\Omega(E)$
functions. This principle of averaging is, of course, a new and arbitrary
agreement: Functions which pertain to different $N$ (and so depend on a
different number of variables) were never equal to one another in statistical
weight; now we ascribe the same weight to all of them, independent of the
value of $N$. In §6 we shall strive to justify this new and arbitrary
assumption by showing that for the cases of interest in statistical physics the
results obtained by our method of averaging are independent of this method.
Hence any other principle of averaging (within broad limits) would lead 
us to the same conclusions. 


§2. Occupation numbers and their mean values 


As we saw in III, §4, the symmetric eigenfunctions which constitute our
linear basis can be chosen from the set of so-called “fundamental”
functions of the system. In a state of the system described by such a function,
the number $a_r$ of particles in a state with energy level $\varepsilon_r$ is fully determined.
In other words, in each of the $\Omega(E)$ fundamental states comprising our
chosen basis we have a completely determined set of “occupation numbers”
$a_r$ ($r = 1, 2, \cdots$) which are, of course, always subject to the requirement
that


(2) $\sum_{r=1}^{\infty} a_r \varepsilon_r = E.$


The microcanonical averages of the occupation numbers $a_r$ are therefore
given by the formula


(3) $\langle a_r \rangle = [\Omega(E)]^{-1} \sum_{U} a_r(U),$

where the summation extends over all the fundamental functions $U$ of our
basis, and $a_r(U)$ denotes the quantity $a_r$ in the state described by the
function $U$.

A basic problem in the statistical thermodynamics of any system is to 
determine the distribution of the energy of the system among its
component particles. In our theory we take the microcanonical average $\langle a_r \rangle$
as the mean number of particles in an energy level $\varepsilon_r$. Therefore, the
estimation of these quantities $\langle a_r \rangle$ is a primary problem. Moreover, as we
saw in III, §5, knowing the quantities $\langle a_r \rangle$ allows us to write down
immediately the microcanonical average of any sum function. It is chiefly by
means of these sum functions that the state of a system is characterized
in statistical physics. Further, as we saw in III, §6, to determine the
suitability of the microcanonical averages (i.e., to justify the method of
microcanonical averaging) in the case of sum functions it is also necessary to
know the quantities $\langle a_r a_s \rangle$ ($r, s = 1, 2, \cdots$), which represent the
microcanonical averages of the products of pairs of occupation numbers.
Therefore, any mathematical apparatus which is employed in statistical physics
must provide convenient analytical expressions for the quantities $\langle a_r \rangle$
and $\langle a_r a_s \rangle$. This statement pertains equally to particles of any type and
to any of the three basic statistical schemes. In the following sections we
shall look at the simplest example of photons to see how these expressions 
can be obtained. 

Our first step will be to determine an elementary expression for the
numbers $\langle a_r \rangle$ and $\langle a_r a_s \rangle$ in terms of the structure function $\Omega(E)$, which
is the number of solutions of equation (2) with integral $a_r \ge 0$. For this
purpose we note first of all that formula (3) can obviously be rewritten in
the form


(4) $\langle a_r \rangle = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} k A_k,$


where $A_k$ denotes the number of those fundamental states (i.e.,
fundamental functions of our basis) in which $a_r = k$ ($k = 1, 2, \cdots$). Now we denote
by $M_k$ ($k = 1, 2, \cdots$) the number of those fundamental states in which
$a_r \ge k$, so that


$A_k = M_k - M_{k+1} \qquad (k = 1, 2, \cdots).$


$M_k$ is obviously equal to the number of those solutions of equation (2) in
which $a_r \ge k$; or, equivalently, to the number of solutions in terms of
integral $b_i \ge 0$ of the equation

$b_1 \varepsilon_1 + \cdots + b_{r-1} \varepsilon_{r-1} + (k + b_r) \varepsilon_r + b_{r+1} \varepsilon_{r+1} + \cdots = E;$


or


$\sum_{i=1}^{\infty} b_i \varepsilon_i = E - k \varepsilon_r.$

This means that $M_k = \Omega(E - k\varepsilon_r)$ ($k = 1, 2, \cdots$), and hence that

$A_k = \Omega(E - k\varepsilon_r) - \Omega[E - (k + 1)\varepsilon_r] \qquad (k = 1, 2, \cdots).$

Therefore, formula (4) gives

$\langle a_r \rangle = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} k \{\Omega(E - k\varepsilon_r) - \Omega[E - (k + 1)\varepsilon_r]\}.$

By an elementary transformation due to Abel, we then find that

(5) $\langle a_r \rangle = \sum_{k=1}^{\infty} \Omega(E - k\varepsilon_r)/\Omega(E).$


This simple formula will be the starting point of our further calculations. 
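
The Abel transformation invoked for (5) is summation by parts. A short worked version follows; the cutoff index $K$ is introduced only for this illustration and is chosen with $K\varepsilon_r > E$, so that $M_{K+1} = 0$.

```latex
\begin{aligned}
\sum_{k=1}^{K} k\,(M_k - M_{k+1})
  &= \sum_{k=1}^{K} k\,M_k \;-\; \sum_{k=2}^{K+1} (k-1)\,M_k \\
  &= \sum_{k=1}^{K} M_k \;-\; K\,M_{K+1}
   \;=\; \sum_{k=1}^{K} \Omega(E - k\varepsilon_r).
\end{aligned}
```

Since $M_k = \Omega(E - k\varepsilon_r)$ vanishes as soon as $k\varepsilon_r > E$, both sums terminate, and division by $\Omega(E)$ gives (5).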

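We turn now to the derivation of an analogous expression for $\langle a_r a_s \rangle$.
We suppose first that $r \ne s$. Let $A_{kl}$ denote the number of solutions of
equation (2) in which $a_r = k$, $a_s = l$. Then, obviously,


(6) $\langle a_r a_s \rangle = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} kl\, A_{kl}.$


On the other hand, if we denote by $M_{kl}$ the number of solutions of (2) in
which $a_r \ge k$, $a_s \ge l$, then, as is easily verified,


(7) $A_{kl} = M_{kl} - M_{k+1,l} - M_{k,l+1} + M_{k+1,l+1}.$


Further, in analogy with our previous calculation, we note that $M_{kl}$ can
be represented as the number of solutions in terms of integral $b_i \ge 0$ of
the equation


$b_1 \varepsilon_1 + b_2 \varepsilon_2 + \cdots + (k + b_r)\varepsilon_r + \cdots + (l + b_s)\varepsilon_s + \cdots = E;$

or

$\sum_{i=1}^{\infty} b_i \varepsilon_i = E - k\varepsilon_r - l\varepsilon_s.$

Hence,

$M_{kl} = \Omega(E - k\varepsilon_r - l\varepsilon_s).$

Therefore, formulas (6) and (7) give

$\langle a_r a_s \rangle = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} kl \{\Omega(E - k\varepsilon_r - l\varepsilon_s)$
$- \Omega[E - (k + 1)\varepsilon_r - l\varepsilon_s] - \Omega[E - k\varepsilon_r - (l + 1)\varepsilon_s]$
$+ \Omega[E - (k + 1)\varepsilon_r - (l + 1)\varepsilon_s]\};$

and, after twice applying the Abel transformation, we find

(8) $\langle a_r a_s \rangle = \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \Omega(E - k\varepsilon_r - l\varepsilon_s)/\Omega(E).$


Finally, in the case $r = s$, the same argument applies in finding the
required expression for the quantity $\langle a_r^2 \rangle$. Retaining the former notation,
we have

$\langle a_r^2 \rangle = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} k^2 A_k = [\Omega(E)]^{-1} \sum_{k=1}^{\infty} k^2 (M_k - M_{k+1})$
$= [\Omega(E)]^{-1} \sum_{k=1}^{\infty} (2k - 1) M_k,$


where the last expression is obtained from the penultimate one by the
Abel transformation. Since we found that $M_k = \Omega(E - k\varepsilon_r)$ ($k = 1, 2, \cdots$),
we obtain


(9) $\langle a_r^2 \rangle = \sum_{k=1}^{\infty} (2k - 1)\,\Omega(E - k\varepsilon_r)/\Omega(E).$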


This completes the first stage of our investigation. Formulas (5), (8)
and (9) give us expressions for the numbers $\langle a_r \rangle$, $\langle a_r a_s \rangle$ ($r \ne s$) and
$\langle a_r^2 \rangle$ in terms of the structure function $\Omega$ and these in turn reduce our
problem to that of determining a convenient analytic expression for the 
structure function. 
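
Formulas (5), (8) and (9) are exact identities, so they can be checked on a small system by listing every fundamental state. The following is a minimal sketch, assuming positive integer levels; all function names and the sample spectrum are hypothetical, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def structure_function(levels, e_max):
    """[Omega(0), ..., Omega(e_max)] for positive integer levels."""
    omega = [0] * (e_max + 1)
    omega[0] = 1
    for eps in levels:
        for x in range(eps, e_max + 1):
            omega[x] += omega[x - eps]
    return omega


def enumerate_states(levels, energy):
    """Yield every tuple of occupation numbers with total energy `energy`."""
    ranges = [range(energy // eps + 1) for eps in levels]
    for a in product(*ranges):
        if sum(n * eps for n, eps in zip(a, levels)) == energy:
            yield a


def brute_force_averages(levels, energy, r, s):
    """Averages of a_r, a_r*a_s and a_r**2 taken over all states directly."""
    states = list(enumerate_states(levels, energy))
    count = len(states)
    return (Fraction(sum(a[r] for a in states), count),
            Fraction(sum(a[r] * a[s] for a in states), count),
            Fraction(sum(a[r] ** 2 for a in states), count))


def formula_averages(levels, energy, r, s):
    """The same three averages from formulas (5), (8), (9); requires r != s."""
    omega = structure_function(levels, energy)

    def om(x):
        return omega[x] if x >= 0 else 0  # Omega vanishes below zero

    er, es, total = levels[r], levels[s], omega[energy]
    ks = range(1, energy // er + 1)
    ls = range(1, energy // es + 1)
    f5 = Fraction(sum(om(energy - k * er) for k in ks), total)
    f8 = Fraction(sum(om(energy - k * er - l * es) for k in ks for l in ls), total)
    f9 = Fraction(sum((2 * k - 1) * om(energy - k * er) for k in ks), total)
    return f5, f8, f9


if __name__ == "__main__":
    levels = [1, 1, 2, 3]  # hypothetical spectrum with one degenerate level
    print(brute_force_averages(levels, 7, 0, 2) == formula_averages(levels, 7, 0, 2))
```

The brute-force enumeration grows very quickly with the energy, which is exactly why the text replaces it by expressions in the structure function.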


§3. Reduction to a problem of the theory of probability 


The second step in our present investigation will be to reduce the problem 
of finding an asymptotic expression for the structure function Q of our 
system, to a comparatively simple problem of the theory of probability. 
This step has great methodological significance because it establishes a 
bridge between the basic problems of statistical mechanics and the limit 
problems of the theory of probability. This bridge enables us to use the 
well-developed analytical methods of the modern theory of probability to 
solve problems in statistical physics.

However, before making this reduction, we must study in some detail 
the sequence 


(10) $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_r, \cdots$


of possible energy levels of each of our photons. [In the sequence (10) each
level is repeated a number of times equal to its multiplicity (degree of
degeneracy).] First let our “photon gas” be enclosed in a vessel of unit
volume. According to quantum physics, on the average, the number $g_k$ of
possible photon energy levels contained between $k$ and $k + 1$ approaches
infinity as $k \to \infty$. (We shall find the law which determines this increase
in §5 of the present chapter.) In §2 of the Introduction, we agreed to
approximate all the $\varepsilon_r$ by integers. Therefore, we suppose that the integer
$k$ occurs $g_k$ times in the sequence (10). This is the situation when the
volume occupied by our system is unity. If this volume equals $V$ (we always
consider $V$ to be a large integer), then, under very general conditions
concerning the shape of the vessel, it can be shown from quantum physics that
for sufficiently large $k$ the number of possible energy levels in any (not too
small) interval including $k$ is increased by a factor $V$. We can assume this
to be true for all $k$ (small and large) without appreciably affecting the
results in most physical problems. Therefore, we suppose that every integer
$k$ is repeated $Vg_k$ times in the sequence (10). In particular, this means that
the sequence (10) and the structure function $\Omega(E)$ depend strongly on
the volume $V$ occupied by our system. However, we must note that the
asymptotic formulas which we shall derive in the following sections, are 
based on the assumption that E and V become infinitely large while main- 
taining some constant ratio (constant energy density E/V). Practically, 
this means that for the unit of energy we take approximately the average 
energy of an individual photon, so that E becomes a very large quantity 
(of the order of the average number of photons). For the volume we choose 
a unit so small that, on the average, there will be only a small number of 
photons per unit volume. The number V will then also be of the order of 
the average number of photons. 

Finally, we make the following remark which will prove important later: 
Suppose we have any function $f(x)$ for which the series


(11) $\sum_{r=1}^{\infty} f(\varepsilon_r)$

converges absolutely. Then from the preceding discussion it follows that

(12) $\sum_{r=1}^{\infty} f(\varepsilon_r) = V \sum_{k=1}^{\infty} g_k f(k).$


Since the sum on the right side of this equation is independent of V, every 
quantity of the form (11) in our asymptotic formulas must be thought of as 
infinitely large, proportional to V. 

Now we proceed to establish a probabilistic expression for the structure
function $\Omega$. Let $\beta$ be a positive number, for the moment completely
arbitrary. We consider random variables whose possible values are
non-negative integral multiples of some fixed integer $k$, i.e., the numbers


$0, k, 2k, \cdots, nk, \cdots$

Let the probability that the random variable assume the value $nk$ be

$e^{-\beta kn} \big/ \sum_{m=0}^{\infty} e^{-\beta km} = (1 - e^{-\beta k})\, e^{-\beta kn} \qquad (n = 0, 1, \cdots).$


We denote this distribution by $p_k(x)$ and consider the sum of an infinite
series of mutually independent random variables, of which $g_k$ variables are
distributed according to $p_k(x)$ ($k = 1, 2, \cdots$). We shall show that this
series converges with probability 1. For a variable distributed according
to $p_k(x)$, the probability that it is different from zero is $e^{-\beta k}$. Therefore,
the probability that at least one of the $g_k$ terms of our series, which are
distributed according to $p_k(x)$, is different from zero does not exceed $g_k e^{-\beta k}$.
But the numbers $g_k$ for most problems in quantum physics are such that
the series

$\sum_{k=1}^{\infty} g_k e^{-\beta k}$

converges for all $\beta > 0$. Thus, with a probability as near as we please to
unity, we can expect that all the terms of our series, beginning with the 
nth, will vanish if the number n is sufficiently large. But a series containing 
only a finite number of non-zero terms is trivially convergent. It therefore 
follows that the probability of convergence of our series is arbitrarily close 
to unity and hence equals 1, q.e.d. 

The sum of our series is therefore a random variable whose distribution
we denote by $P(x)$. Let $S_V$ stand for the sum of $V$ mutually independent
random variables


$S_V = \sum_{i=1}^{V} X_i,$


each of which has the distribution $P(x)$. We seek the distribution of the
quantity $S_V$, i.e., the probability


$P(S_V = q)$


that $S_V$ is equal to a given non-negative integer $q$.

The quantity $S_V$ is the sum of $V$ mutually independent random variables
$X_i$, each of which, being distributed according to $P(x)$, can in turn be
considered as the sum of an infinite series of mutually independent random
variables, among which there are $g_k$ variables distributed according to
$p_k(x)$. Thus we can consider the quantity $S_V$ to be the sum of an infinite
series of mutually independent random variables, among which there are
$Vg_k$ variables distributed according to $p_k(x)$. We shall denote these
quantities by $x_{kl}$ ($1 \le l \le Vg_k$), so that


$S_V = \sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} x_{kl},$


where the $x_{kl}$ obey the distribution $p_k(x)$, and all the $x_{kl}$ are mutually
independent. Hence, in order that $S_V = q$, it is necessary that


(13) $\sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} x_{kl} = q.$


Since the random variable $x_{kl}$ obeys the distribution $p_k(x)$, it can assume
only values of the form $kn_{kl}$, where $n_{kl} \ge 0$ is an integer. Therefore, for
the realization of equation (13), the probability of which we require, the
system of values $kn_{kl}$, assumed by the variables $x_{kl}$ must be such that


(K) $\sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} k n_{kl} = q.$

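Because of the mutual independence of the variables $x_{kl}$ the probability
of their assuming the set of values $x_{kl} = kn_{kl}$ ($k = 1, 2, \cdots$; $l = 1, 2,
\cdots, Vg_k$) is


$\prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} P(x_{kl} = kn_{kl}) = \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} (1 - e^{-\beta k})\, e^{-\beta k n_{kl}}$
$= \{\prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} (1 - e^{-\beta k})\}\, e^{-\beta Z}$
$= \{\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{Vg_k}\}\, e^{-\beta Z},$

where $Z = \sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} k n_{kl}$. The probability of relation (13) is equal to
a sum of probabilities of the form just written down, extended over all
sets of values of the variables $x_{kl}$ satisfying condition (K). This sum can
be written as


$\sum_{(K)} \{\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{Vg_k}\}\, e^{-\beta q} = \{\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{Vg_k}\}\, e^{-\beta q} \sum_{(K)} 1.$

The first factor

$\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{Vg_k}$

is a completely determined function of the parameters $\beta$ and $V$. Putting

$\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{-g_k} = \Phi(\beta),$

we have

$\prod_{k=1}^{\infty} (1 - e^{-\beta k})^{Vg_k} = \{\Phi(\beta)\}^{-V}.$


The last factor $\sum_{(K)} 1$ is the number of solutions of equation (K) with
integral $n_{kl} \ge 0$. It is easy to see that equation (K) differs only in notation
from the equation


(14) $\sum_{r=1}^{\infty} a_r \varepsilon_r = q.$


In fact, the number of levels $\varepsilon_r$ equal to $k$ in equation (14) is $Vg_k$, and the
coefficients $a_r$ of these levels can be denoted instead by $n_{kl}$ ($1 \le l \le Vg_k$).
This then transforms equation (14) into equation (K). Thus,


$\sum_{(K)} 1 = \Omega(q),$


and we find

$P(S_V = q) = \{\Phi(\beta)\}^{-V} e^{-\beta q}\, \Omega(q),$

whence

(15) $\Omega(q) = \{\Phi(\beta)\}^{V} e^{\beta q}\, P(S_V = q).$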


This is the desired expression for the structure function. (It shows, in 
particular, that the possible values of the quantity Sy are the possible 
energy levels of the system. This could easily be seen directly from the 
definition of $S_V$.) We see that $\Omega(q)$ is very simply expressed in terms of the
distribution $P(S_V = q)$. Since we defined the random variable $S_V$ as the
sum of a very large number $V$ of identically distributed and mutually
independent random variables, we can use the highly developed analytical
apparatus of the theory of probability to calculate approximately the
probability $P(S_V = q)$. Our problem is not complicated by the presence in
formula (15) of the factor $\{\Phi(\beta)\}^{V}$ (which is independent of $q$), because
formulas (5), (8) and (9) of the preceding section contain only ratios of
structure functions of different arguments; hence this factor will cancel
completely.

We note, finally, that the parameter $\beta$ in formula (15) can have any
positive value. We shall choose its value to give the asymptotic formulas 
the simplest possible form. 
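
Formula (15) can be checked numerically on a toy system. The sketch below makes one simplifying assumption that the text does not make: the multiplicities $g_k$ vanish beyond a largest level, so that the "infinite series" is a finite sum and $P(S_V = q)$ can be computed exactly by repeated convolution. The names `distribution_of_sum`, `phi` and `max_relative_error` and the sample values of $g_k$, $V$ and $\beta$ are hypothetical.

```python
import math


def structure_function(levels, e_max):
    """[Omega(0), ..., Omega(e_max)] for positive integer levels."""
    omega = [0] * (e_max + 1)
    omega[0] = 1
    for eps in levels:
        for x in range(eps, e_max + 1):
            omega[x] += omega[x - eps]
    return omega


def distribution_of_sum(g, volume, beta, q_max):
    """Return [P(S_V = 0), ..., P(S_V = q_max)] for a finite spectrum g."""
    dist = [0.0] * (q_max + 1)
    dist[0] = 1.0
    for k, g_k in g.items():
        w = math.exp(-beta * k)
        for _ in range(volume * g_k):
            # Add one variable with P(x = k*n) = (1 - w) * w**n.  Going
            # upward in x, dist[x - k] is already the updated value, which
            # sums the contributions n = 1, 2, ... in a single pass.
            for x in range(q_max + 1):
                carried = w * dist[x - k] if x >= k else 0.0
                dist[x] = (1.0 - w) * dist[x] + carried
    return dist


def phi(g, beta):
    """Phi(beta) = prod_k (1 - exp(-beta*k)) ** (-g_k)."""
    return math.prod((1.0 - math.exp(-beta * k)) ** (-g_k) for k, g_k in g.items())


def max_relative_error(g, volume, beta, q_max):
    """Largest relative gap between the two sides of (15) for 0 <= q <= q_max."""
    levels = [k for k, g_k in g.items() for _ in range(volume * g_k)]
    omega = structure_function(levels, q_max)
    dist = distribution_of_sum(g, volume, beta, q_max)
    factor = phi(g, beta) ** volume
    return max(abs(factor * math.exp(beta * q) * dist[q] - omega[q]) / omega[q]
               for q in range(q_max + 1))


if __name__ == "__main__":
    g = {1: 2, 2: 1, 3: 1}  # hypothetical multiplicities, zero beyond k = 3
    print(max_relative_error(g, volume=3, beta=0.7, q_max=12))
```

The printed value should be at the level of floating-point rounding for any positive $\beta$, which illustrates the remark that $\beta$ in (15) is arbitrary.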


§4. Application of a limit theorem of the theory of probability 


We now proceed to the third and final step in our study — the applica- 
tion of the local limit theorem of the theory of probability proved in I, §4. 

It is first necessary to choose the value of the parameter $\beta$. To this end
we prove the following auxiliary proposition:

Lemma. For any positive number $C$, the equation


$d \ln \Phi(\beta)/d\beta + C = 0$


has a unique positive root.
Let us consider the function


$\Phi_C(\beta) = e^{C\beta}\, \Phi(\beta).$


It is obvious that $\Phi(\beta)$ and $\Phi_C(\beta)$ approach infinity as $\beta \to 0$. On the other
hand, since $\Phi(\beta)$ is always greater than unity,


$\Phi_C(\beta) > e^{C\beta}$


and $\Phi_C(\beta)$ also approaches infinity as $\beta \to \infty$. Thus, the function $\Phi_C(\beta)$ and
the function $\ln \Phi_C(\beta)$ approach infinity both for $\beta \to 0$ and for $\beta \to \infty$.
However,


$\ln \Phi_C(\beta) = C\beta + \ln \Phi(\beta),$

and hence,

$d^2 \ln \Phi_C(\beta)/d\beta^2 = d^2 \ln \Phi(\beta)/d\beta^2 = \sum_{k=1}^{\infty} k^2 g_k e^{-\beta k}/(1 - e^{-\beta k})^2 > 0.$


Therefore, the function $\ln \Phi_C(\beta)$ is convex on the entire half-line $(0, +\infty)$.
Combining these properties, we see that


$d \ln \Phi_C(\beta)/d\beta = C + d \ln \Phi(\beta)/d\beta$


vanishes at precisely one point of the half-line. This proves the lemma.
Let our “photon gas” have energy $E$ and volume $V$. Then we set $\beta$ equal
to the root (which exists and is unique by our lemma) of the equation


(16) $d \ln \Phi(\beta)/d\beta + E/V = 0.$


Since our asymptotic formulas, as already noted, will be derived under the
assumption that $E$ and $V$ are increasing without bound, but with the ratio
$E/V$ remaining constant, we can consider $\beta$ to be constant in these
formulas.
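
Because $-d \ln \Phi(\beta)/d\beta = \sum_k k g_k e^{-\beta k}/(1 - e^{-\beta k})$ decreases monotonically in $\beta$, the root of (16) can be located by bisection. The following is a minimal sketch, not a method prescribed by the text: the multiplicity law `g_quadratic` (with $g_k = k^2$), the truncation `k_max` of the series and the function names are all assumptions made for illustration.

```python
import math


def g_quadratic(k):
    """Hypothetical multiplicity law g_k = k**2, in arbitrary units."""
    return k * k


def mean_energy_per_volume(beta, g, k_max=2000):
    """-d ln Phi / d beta = sum_k k*g_k*exp(-beta*k) / (1 - exp(-beta*k)).

    The series is truncated at k_max (an assumption); the form with
    exp(-beta*k) avoids overflow when beta*k is large.
    """
    return sum(k * g(k) * math.exp(-beta * k) / (-math.expm1(-beta * k))
               for k in range(1, k_max + 1))


def solve_beta(energy_density, g, k_max=2000, tol=1e-12):
    """Root of equation (16) for a given energy density E/V, by bisection."""
    lo, hi = 1e-3, 1.0
    while mean_energy_per_volume(lo, g, k_max) < energy_density:
        lo /= 2.0  # move left until the mean energy exceeds the target
    while mean_energy_per_volume(hi, g, k_max) > energy_density:
        hi *= 2.0  # move right until the mean energy drops below the target
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if mean_energy_per_volume(mid, g, k_max) > energy_density:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    beta = solve_beta(5.0, g_quadratic)
    print(beta, mean_energy_per_volume(beta, g_quadratic))
```

A larger energy density gives a smaller root, in agreement with the later identification of $\beta$ with $1/kT$.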
The quantity $S_V$, for whose distribution we seek an asymptotic
expression, is the sum of $V$ mutually independent random variables


$X_i \qquad (1 \le i \le V),$


each of which is distributed according to $P(x)$. This distribution (which,
by the way, is independent of $E$ and $V$) is the distribution of the sum


$X_i = \sum_{k=1}^{\infty} \sum_{l=1}^{g_k} x_{kl},$


where the variables $x_{kl}$ are distributed according to $p_k(x)$. The
mathematical expectation and the dispersion of the variable $x_{kl}$ are, respectively,
(as is easily verified)


$a_{kl} = k e^{-\beta k}/(1 - e^{-\beta k})$

and

$b_{kl} = k^2 e^{-\beta k}/(1 - e^{-\beta k})^2.$


It follows from a very elementary computation that the mathematical
expectation and the dispersion of each of the variables $X_i$ are, respectively,


$a = \sum_{k=1}^{\infty} \sum_{l=1}^{g_k} a_{kl} = \sum_{k=1}^{\infty} k g_k e^{-\beta k}/(1 - e^{-\beta k}) = -d \ln \Phi(\beta)/d\beta$

and

$b = \sum_{k=1}^{\infty} \sum_{l=1}^{g_k} b_{kl} = \sum_{k=1}^{\infty} k^2 g_k e^{-\beta k}/(1 - e^{-\beta k})^2 = d^2 \ln \Phi(\beta)/d\beta^2.$


For the mathematical expectation $A$ and the dispersion $B$ of the variable
$S_V$, we therefore find the expressions


$A = Va = -V\, d \ln \Phi/d\beta, \qquad B = Vb = V\, d^2 \ln \Phi/d\beta^2.$


From relation (16) (i.e., from our choice for the value of the parameter
$\beta$) we have, obviously, $A = E$. Therefore, the application of the
one-dimensional local limit theorem in its simplified form [I, §4, (28)] gives


(17) $P(S_V = E + u) = d(2\pi B)^{-1/2} e^{-u^2/2B} + O\{V^{-3/2}(1 + |u|)\},$


where $d$ denotes the spacing (in the sense defined in I, §2) of the sequence
of energy levels of the system, and $u$ is any integer such that $E + u$ is
one of the levels.

Since


$e^{-u^2/2B} = 1 + O(u^2/V),$

it follows that

$P(S_V = E + u) = d(2\pi B)^{-1/2} + O\{V^{-3/2}(1 + u^2)\}.$
Putting $q = E + u$ in formula (15) we find

$\Omega(E + u) = \{\Phi(\beta)\}^{V} e^{\beta(E + u)} \{d(2\pi B)^{-1/2} + O[V^{-3/2}(1 + u^2)]\};$

and, in particular, for $u = 0$,

$\Omega(E) = \{\Phi(\beta)\}^{V} e^{\beta E} \{d(2\pi B)^{-1/2} + O(V^{-3/2})\}.$

Hence,

(18) $\Omega(E + u)/\Omega(E) = e^{\beta u}\{1 + O[V^{-1}(1 + u^2)]\}\{1 + O(V^{-1})\}$
$= e^{\beta u}\{1 + O[V^{-1}(1 + u^2)]\}.$


Now we easily obtain a simple asymptotic expression for the mean values
of the occupation numbers $a_r$. Using (18) and formula (5) of §2, we find


(19) $\langle a_r \rangle = \sum_{k=1}^{\infty} e^{-\beta k \varepsilon_r}\{1 + O[V^{-1}(1 + k^2 \varepsilon_r^2)]\}$
$= \sum_{k=1}^{\infty} e^{-\beta k \varepsilon_r} + O(V^{-1}) = (e^{\beta \varepsilon_r} - 1)^{-1} + O(V^{-1}).$


Since the number $k$ is repeated $Vg_k$ times among the levels $\varepsilon_r$, the mean
number of photons with energy $k$ is


(20) $[Vg_k/(e^{\beta k} - 1)] + O(1).$


Let $\mathfrak{A}$ be a sum function:


$\mathfrak{A} = \sum_{i} \mathfrak{A}_i,$

where the quantity $\mathfrak{A}_i$ depends only on the coordinates of the $i$th photon.
The microcanonical average of the mathematical expectation of $\mathfrak{A}$ is, by
II, §5, (16),


$\langle \mathfrak{A} \rangle = \sum_{r=1}^{\infty} \lambda_r \langle a_r \rangle,$


where $\lambda_r$ is the mathematical expectation of $\mathfrak{A}_i$ (which was assumed to be
the same for all $i$) in the photon state characterized by the energy level
$\varepsilon_r$. (We recall that the quantity $\lambda_r$ depends only on the form of the
function $\mathfrak{A}_i$ and is determined independently of any statistical considerations.)
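
The asymptotic formula (19) for the mean occupation numbers can be compared with the exact value given by formula (5) on a small model. The sketch below again assumes a finite spectrum (multiplicities $g_k$ that vanish beyond a largest level), which is not the situation of the text; the names `solve_beta` and `compare_occupation` and the sample values are hypothetical.

```python
import math


def structure_function(levels, e_max):
    """[Omega(0), ..., Omega(e_max)] for positive integer levels."""
    omega = [0] * (e_max + 1)
    omega[0] = 1
    for eps in levels:
        for x in range(eps, e_max + 1):
            omega[x] += omega[x - eps]
    return omega


def solve_beta(g, energy_density):
    """Root of (16) for a finite spectrum: sum_k k*g_k/(e^{beta k} - 1) = E/V."""
    def mean_energy(beta):
        return sum(k * g_k / math.expm1(beta * k) for k, g_k in g.items())

    lo, hi = 1e-6, 50.0  # assumed to bracket the root for moderate densities
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid) > energy_density:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compare_occupation(g, volume, energy, eps_r):
    """Return (exact, asymptotic) mean occupation of one level of energy eps_r."""
    levels = [k for k, g_k in g.items() for _ in range(volume * g_k)]
    omega = structure_function(levels, energy)
    # Formula (5): sum over k of Omega(E - k*eps_r), divided by Omega(E).
    exact = sum(omega[energy - k * eps_r]
                for k in range(1, energy // eps_r + 1)) / omega[energy]
    beta = solve_beta(g, energy / volume)
    # Leading term of formula (19).
    asymptotic = 1.0 / math.expm1(beta * eps_r)
    return exact, asymptotic


if __name__ == "__main__":
    g = {1: 1, 2: 1, 3: 1}  # hypothetical multiplicities per unit volume
    for eps in (1, 2, 3):
        print(eps, compare_occupation(g, volume=60, energy=120, eps_r=eps))
```

Repeating the comparison with larger $V$ at the same ratio $E/V$ gives a direct feeling for the $O(V^{-1})$ remainder in (19).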
The first of equations (19) gives us


$\langle \mathfrak{A} \rangle = \sum_{r=1}^{\infty} \lambda_r \sum_{k=1}^{\infty} e^{-\beta k \varepsilon_r}\{1 + O[V^{-1}(1 + k^2 \varepsilon_r^2)]\}$
$= \sum_{r=1}^{\infty} \lambda_r (e^{\beta \varepsilon_r} - 1)^{-1} + O\{V^{-1}[\sum_{r=1}^{\infty} |\lambda_r| (e^{\beta \varepsilon_r} - 1)^{-1}$
$+ \sum_{r=1}^{\infty} |\lambda_r|\, \varepsilon_r^2 \sum_{k=1}^{\infty} k^2 e^{-\beta k \varepsilon_r}]\}.$


If we suppose (as it actually happens in the great majority of real
physical problems) that $\lambda_r$ is a single-valued function of $\varepsilon_r$ (i.e., that the
quantity $\mathfrak{A}_i$ has the same mathematical expectation for all photon states which
correspond to the same energy level), then all the series on the right side
of the above equation have the form (11) (§3), and therefore by formula
(12) are quantities proportional to $V$. (It is, of course, assumed that these
series converge.) Hence we find


(21) $\langle \mathfrak{A} \rangle = \sum_{r=1}^{\infty} \lambda_r (e^{\beta \varepsilon_r} - 1)^{-1} + O(1)$
$= V \sum_{k=1}^{\infty} g_k \Lambda_k (e^{\beta k} - 1)^{-1} + O(1),$


where $\Lambda_k = \lambda_r$ for $\varepsilon_r = k$. Thus, for a sum function the microcanonical
average is always asymptotically proportional to $V$. This is, of course,
immediately obvious. The asymptotic formula (21) defines the proportionality
factor and shows that the relative error in the estimate obtained is
extremely small.

In the case of photons, one of the most important sum functions is the
number of photons $N$. For given $E$ and $V$, this number can assume different
values, depending on the state of the system (see §1). In other words, in a
definite state of the system, the number $N$ does not, in general, have a fixed
value. Only its distribution is defined, so that it is only meaningful to
speak of the microcanonical average $\langle N \rangle$ of the mathematical expectation
of the number $N$.

If $\mathfrak{A} = N$, then obviously $\mathfrak{A}_i = 1$, and hence $\lambda_r = 1$ ($r = 1, 2, \cdots$).
Therefore, we find


$\langle N \rangle = \sum_{r=1}^{\infty} \langle a_r \rangle = \sum_{r=1}^{\infty} (e^{\beta \varepsilon_r} - 1)^{-1} + O(1)$
$= V \sum_{k=1}^{\infty} g_k (e^{\beta k} - 1)^{-1} + O(1).$

The mean number of photons per unit volume is thus

$\langle N \rangle / V = \sum_{k=1}^{\infty} g_k (e^{\beta k} - 1)^{-1} + O(V^{-1}).$


§5. The Planck formula


To give a concrete meaning to the formulas derived in the preceding
section, we must determine the numbers $g_k$ and find the physical meaning of
the parameter $\beta$. We shall consider the second problem in detail later, since
extensive material, which we have not yet developed, is required to justify
the universal value given to $\beta$. However, anticipating the result of a more
detailed discussion, we simply say that in all cases we prescribe for the
parameter $\beta$ the value $1/kT$, where $T$ is the absolute temperature of the
system and $k$ is a universal constant (the so-called Boltzmann constant).
Thus, for any arbitrarily complicated system, composed of particles of any
type, the parameter $\beta$ always characterizes the temperature.

The determination of the numbers $g_k$, to which we now turn, can be
accomplished in several physically different ways. One can start either from
the wave theory of light, or partly from the wave theory and partly from
the corpuscular theory, or, finally, from the purely corpuscular theory.
Naturally, we choose the last way, since it is the only one which corresponds
completely to the whole spirit of the theory we have developed. It must be
pointed out, however, that the problem of the determination of the
numbers $g_k$ is in no way connected with statistical considerations. We solve
this problem merely to present applications of our statistical conclusions
which can be compared directly with experiment.

By definition, $Vg_k$ denotes the number of linearly independent states of
a photon in which its energy lies between $k$ and $k + 1$. We must therefore
write down the so-called “time-independent Schrödinger equation”


$\mathcal{H}U = EU,$


where $\mathcal{H}$ is the photon energy operator and $E$ is a constant, and we must
find the number of its linearly independent solutions for values of $E$
between $k$ and $k + 1$. However, the usual expression for the energy operator
of a particle not under the influence of an external field,


$\mathcal{H} = -(h^2/8\pi^2 m)(\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2),$

cannot be employed in the case of photons, since $m = 0$ for a photon, and
the connection between the energy and the momentum $p$ is not given by
the equation


$\varepsilon = p^2/2m,$


as it is for a (non-relativistic) material particle. [To put these results in
their usual form, we use Planck’s constant $h$ directly, instead of $\hbar$ (Planck’s
constant divided by $2\pi$). Both systems of notation are used in physics.]
In order to determine the operator $\mathcal{H}$ correctly, we must start from the
general relation between the energy and momentum of a particle, as given
by the theory of relativity


$\varepsilon^2 = c^2(m^2c^2 + p^2),$

where $c$ denotes the velocity of light in vacuo. This relation is valid for
particles of any type. In the case of photons, $m = 0$ and


$\varepsilon^2 = c^2p^2 = c^2(p_x^2 + p_y^2 + p_z^2).$


(In the case of material particles if $p \ll mc$, then $\varepsilon \approx mc^2 + p^2/2m$.)
Thus, for photons $p^2$ is not proportional to $\varepsilon$, but to $\varepsilon^2$, and the operator
of the quantity $c^2p^2$, having the form


(22) $-(c^2h^2/4\pi^2)(\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2),$


does not correspond to the energy, but to its square. We know (II, §3)
that it follows from this that the eigenvalues of operator (22) are the squares
of the eigenvalues of the operator $\mathcal{H}$, and the eigenfunctions of these two
operators are the same. Hence, if we write the equation


$-(c^2h^2/4\pi^2)(\partial^2U/\partial x^2 + \partial^2U/\partial y^2 + \partial^2U/\partial z^2) = E^2U$

or, equivalently,

(23) $\partial^2U/\partial x^2 + \partial^2U/\partial y^2 + \partial^2U/\partial z^2 = -4\pi^2E^2U/(h^2c^2),$


then its solutions will be the eigenfunctions of the photon energy operator, 
corresponding to the energy eigenvalue E. The potential energy of a photon, 
assuming that no external forces act on the system, is due only to the po- 
tential of the walls of the container which encloses our photon gas. Inside 
the container this potential is a constant which we may assume to be zero. 
It is well-known that a linear basis of the solutions of equation (23) can be 
obtained from expressions of the form 


(24) U = C sin (ax + 7) sin (by + £) sin (cz + £), 


where a, b, c, n, ¢, &, and C are constants, the first three of which are re- 
lated by the equation 


(25) È +E HHE = rE ke 


Now, since the probability of finding a photon in the neighborhood of the
point $(x, y, z)$ is proportional to $|U(x, y, z)|^2$ and must vanish outside the
container (and hence, by continuity, also at the walls), we obtain
well-defined boundary conditions, whose explicit formulation requires some
assumption about the shape of the container. We assume that this container
is the parallelepiped $0 < x < l_1$, $0 < y < l_2$, $0 < z < l_3$, with volume
$l_1l_2l_3 = V$. Then if any of the six conditions $x = 0$, $y = 0$, $z = 0$, $x = l_1$,
$y = l_2$, $z = l_3$ is satisfied, we must have $U = 0$. The first three conditions
obviously require that $\eta = \zeta = \xi = 0$ in expression (24), so that we
obtain

$$U = C \sin ax \,\sin by \,\sin cz, \qquad a^2 + b^2 + c^2 = \frac{4\pi^2E^2}{h^2c^2}.$$

The requirement $U = 0$ for $x = l_1$ (independently of $y$ and $z$) necessitates
setting $a = n_1\pi/l_1$, where $n_1$ is an integer; the other two requirements
lead to analogous relations, so that we must have

$$a = n_1\pi/l_1, \qquad b = n_2\pi/l_2, \qquad c = n_3\pi/l_3,$$

where $n_1$, $n_2$ and $n_3$ are integers which, by (25), satisfy the relation

$$\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2} = \frac{4E^2}{h^2c^2}.$$


Thus, the possible values of the energy of the stationary states (i.e., the
energy levels of a photon) are the numbers $E$, whose squares have the form

$$\frac{h^2c^2}{4}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right),$$

where $n_1$, $n_2$ and $n_3$ are integers. Each triad of integers $n_1$, $n_2$, $n_3$ gives us
one of the linearly independent eigenfunctions. More precisely, we must
fix the signs of the numbers $n_1$, $n_2$ and $n_3$, since, for example, changing
$n_1$ into $-n_1$ changes $U$ into $-U$ and does not lead to an eigenfunction
distinct from the preceding ones. For definiteness, therefore, we consider
only the values $n_1 > 0$, $n_2 > 0$, $n_3 > 0$.

The number of linearly independent eigenfunctions corresponding to
values of the energy $E \le k$, where $k$ is a given integer, is equal to the
number of solutions with integral $n_1 > 0$, $n_2 > 0$, $n_3 > 0$ of the inequality

$$\frac{h^2c^2}{4}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right) \le k^2;$$

or, equivalently, to the number of points in the first octant, having integral
coordinates and lying inside or on the boundary of the ellipsoid

$$\frac{h^2c^2}{4}\left(\frac{x^2}{l_1^2} + \frac{y^2}{l_2^2} + \frac{z^2}{l_3^2}\right) = k^2. \tag{26}$$

This number is asymptotically (for large $k$) equal to one eighth of the
volume of the ellipsoid (26), i.e., it is equal to

$$R(k) = \frac{4}{3}\pi\,\frac{k^3}{h^3c^3}\,l_1l_2l_3 = \frac{4\pi V}{3h^3c^3}\,k^3.$$
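The asymptotic equality between the lattice-point count and the octant volume can be tried out numerically. The following Python sketch is only an illustration, not part of the original argument: the function names and the parameter `radius` (which stands for $2k/hc$, so that the semi-axes of the ellipsoid (26) are `radius * l_i`) are assumptions introduced here.

```python
import math

def count_octant_points(radius, l1, l2, l3):
    """Count integer triples n1, n2, n3 >= 1 with
    (n1/l1)^2 + (n2/l2)^2 + (n3/l3)^2 <= radius^2."""
    count = 0
    r2 = radius * radius
    n1 = 1
    while (n1 / l1) ** 2 <= r2:
        rem1 = r2 - (n1 / l1) ** 2
        n2 = 1
        while (n2 / l2) ** 2 <= rem1:
            rem2 = rem1 - (n2 / l2) ** 2
            # number of integers n3 >= 1 with (n3/l3)^2 <= rem2
            count += math.floor(l3 * math.sqrt(rem2))
            n2 += 1
        n1 += 1
    return count

def octant_volume(radius, l1, l2, l3):
    """One eighth of the volume of the ellipsoid with semi-axes radius*l_i."""
    return math.pi / 6.0 * radius ** 3 * l1 * l2 * l3

for radius in (30, 60, 120):
    c = count_octant_points(radius, 1.0, 1.3, 0.7)
    v = octant_volume(radius, 1.0, 1.3, 0.7)
    print(radius, c, round(v), c / v)
```

The printed ratio stays below one, because every counted point carries a unit cube lying inside the octant of the ellipsoid, and it should approach one as `radius` grows; the deficit is a surface effect that is of lower order than the volume.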


Hence, the number of linearly independent eigenfunctions belonging to
energy levels between $k$ and $k + dk$ is approximately

$$R'(k)\,dk = \frac{4\pi V}{h^3c^3}\,k^2\,dk,$$

where the interval $dk$ must be sufficiently large so that the approximate
formula has meaning, and at the same time small compared to $k$. However,
in our previous notation this number is

$$V\sum_{k<r<k+dk} g_r,$$

so that

$$\sum_{k<r<k+dk} g_r = \frac{4\pi}{h^3c^3}\,k^2\,dk.$$

Thus, for $k < r < k + dk$, the value of the number $g_r$ is, “on the average”,

$$\frac{4\pi}{h^3c^3}\,k^2.$$

For the majority of calculations one can, without serious distortion of the
results, assume directly that

$$g_k = \frac{4\pi}{h^3c^3}\,k^2, \tag{27}$$

at least for large values of $k$. However, one must recall that this is only an
approximate calculation and that actually the number $g_k$ depends on the
arithmetic value of the number $k$ in a more complicated way. In particular,
there can be arbitrarily large numbers $k$ which are not encountered among
the energy levels of a photon, and for which therefore $g_k = 0$.

The number (27) is doubled by physicists because of quantum mechani- 
cal considerations which, from the point of view of the wave theory, corre- 
spond to the possibility of different polarizations of light. The fact is that 
the functions $U$, besides depending on their “classical” arguments $x$, $y$, $z$,
also depend on a peculiar quantum variable (“the spin”), which can
assume only two values. Each of the eigenfunctions which we have defined
up to now, therefore, describes not one but two different photon states,
corresponding to the two values of the spin variable. Thus, finally, we obtain
the approximate expression

$$g_k = \frac{8\pi}{h^3c^3}\,k^2,$$

which holds “on the average”.
Substituting this expression in formula (20) of the preceding section, we
find, for the average number of photons with energies between $k$ and $k + 1$,

$$\frac{8\pi Vk^2}{h^3c^3\,(e^{\beta k} - 1)}. \tag{28}$$


The energy $\varepsilon$ of a photon is related to the frequency $\nu$ of the
corresponding light wave by the universal equation $\varepsilon = h\nu$, where $h$ is Planck’s
constant. The number of photons with frequencies between $\nu$ and $\nu + d\nu$ is
therefore the number of photons with energies between $h\nu$ and $h\nu + h\,d\nu$,
which, by formula (28) is, on the average,

$$\frac{8\pi V(h\nu)^2}{h^3c^3\,(e^{\beta h\nu} - 1)}\,h\,d\nu = \frac{8\pi V\nu^2\,d\nu}{c^3\,(e^{h\nu/kT} - 1)}$$

(since $\beta = 1/kT$, where $k$ here denotes Boltzmann's constant and not an
energy level). Since the energy of every such photon is $h\nu$, the average
energy assumed by the whole set of photons with frequencies lying
between $\nu$ and $\nu + d\nu$ is

$$\frac{8\pi hV}{c^3}\,\frac{\nu^3\,d\nu}{e^{h\nu/kT} - 1},$$

and per unit volume

$$\frac{8\pi h}{c^3}\,\frac{\nu^3\,d\nu}{e^{h\nu/kT} - 1}. \tag{29}$$


This is the famous Planck formula, which has played an important role 
in the history of quantum physics. It gives the spectral distribution of 
energy of the so-called ‘‘black-body radiation”. Fifty years ago, after many 
failures of the old classical theory, which constantly gave distributions for 
this spectrum in disagreement with experimental facts, Planck, as a result 
of hypotheses which were bold and risky for those times, arrived at his 
formula (29). It was found to be in excellent agreement with experiment. 
Planck’s derivation of this formula, of course, has nothing in common 
with the one presented here, because Planck based his work on the tenets 
of the wave theory of light. (He could not have done otherwise in those
times.) Planck could hardly have surmised that his bold idea would prove 
to be the “first swallow” of that tempestuous spring which in this half- 
century has radically changed the physics of elementary particles and has 
led mankind to great technical conquests, down to the realization of the 
ancient dream of mastering atomic energy. 
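Formula (29) can be checked numerically against a standard consequence that the text does not state: integrating the spectral density over all frequencies gives a total energy density $8\pi^5k^4T^4/(15h^3c^3)$, where $k$ is Boltzmann's constant. The Python sketch below is an illustration under that assumption; the function names, the midpoint rule and the cutoff `x_max` are choices made here, and the constants are the exact SI values.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def planck_density(nu, temperature):
    """Formula (29) without the factor d(nu): energy per unit volume
    and per unit frequency interval."""
    x = H * nu / (K_B * temperature)
    return 8.0 * math.pi * H * nu ** 3 / (C ** 3 * math.expm1(x))

def total_energy_density(temperature, steps=200_000, x_max=60.0):
    """Midpoint-rule integral of planck_density over 0 < h*nu/kT < x_max."""
    nu_max = x_max * K_B * temperature / H
    d_nu = nu_max / steps
    return d_nu * sum(planck_density((i + 0.5) * d_nu, temperature)
                      for i in range(steps))

def closed_form_energy_density(temperature):
    """8 pi^5 k^4 T^4 / (15 h^3 c^3)."""
    return (8.0 * math.pi ** 5 * K_B ** 4 * temperature ** 4
            / (15.0 * H ** 3 * C ** 3))

T = 300.0
print(total_energy_density(T), closed_form_energy_density(T))
```

The two printed numbers should agree to several significant figures; the cutoff at `x_max = 60` discards only an exponentially small tail of the spectrum.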


§6. On the suitability of microcanonical averages 


According to the general principles of statistical physics, the microca- 
nonical averages of physical quantities are the values predicted by the 
statistical theory. It is these values which we compare with experimental 
data in order to check the theory. Experiment confirms the theory if the 
experimental value of a quantity is near its microcanonical average. How- 
ever, if the possible values of the quantity are widely scattered, i.e., if in 
the majority of cases the values are markedly different from one another 
(and thus also markedly different from the microcanonical average), then 
we certainly cannot expect that a single experiment will give us a result 
which is near the calculated average. Further, situations are possible in 
which the experimental value will surely differ strongly from the average 
value (e.g., a game in which the player has probability 1/2 of either winning
or losing a large sum). In such cases, the average value (regardless of the 
method of averaging) can tell us nothing concerning the result of an ex- 
periment. We cannot even count on the average value of the results of a 
long series of experiments being near our microcanonical average: For this 
it would be necessary that the frequencies, with which the different pos- 
sible states of the system occur in our series of experiments, conform to the 
weights given them in the microcanonical average. We have no reason to 
expect this. In the first place, we chose the microcanonical method of aver- 
aging for a number of theoretical reasons, and we did not take into account 
experimental conditions; but, it is obvious that the actual frequencies 
under discussion will always be different under different conditions of 
experimentation. Hence, no method of averaging can exist which would 
correspond in all cases to the actual experimental conditions. 

The situation is entirely different if the quantity being studied is weakly 
dispersed, i.e., if the great majority of its possible values are not very dif- 
ferent from one another. In this case, obviously, for any method of averag- 
ing (within broad limits) we shall find, for the average value, approximately 
the same number. This number will be close to the great majority of pos- 
sible values of the quantity being studied. Hence, it will be close to the 
great majority of the results of our series of experiments. In this case we 
have every right to expect that a single experiment will give us a result 
near the computed average. 
All these considerations were given, in substance, in Chapter III. We
feel, however, that it is necessary to draw the attention of the reader to 
them repeatedly, because it is just these considerations which describe the 
connection between the physical world and the theory we have developed. 

We saw in III, §6 that the “microcanonical dispersion”

$$D(\mathfrak{A}) = \langle(\mathfrak{A} - a)^2\rangle$$

of a physical quantity $\mathfrak{A}$, with microcanonical average $a$, plays an important
part in determining the suitability of this average. If, as the number of
particles $N$ approaches infinity (and the energy $E$ of the system increases
proportionately), the quantity $D(\mathfrak{A})$ becomes infinitely small compared
to $N^2$ (or, equivalently, to $E^2$), then we can be sure that $|\mathfrak{A} - a|$ will
be as small as we like compared to $|a|$ with a probability arbitrarily close
to unity for any distribution (within wide limits) of frequencies of
occurrence of the possible states of the system. This is precisely what we call the
“suitability” of the microcanonical average $a$. This suitability will
completely justify our arbitrarily chosen method of averaging and will permit
us to expect close agreement between microcanonical averages and
experimental facts. It will be guaranteed whenever we are able to prove the
relation

$$D(\mathfrak{A}) = o(E^2).$$
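The step from this relation to suitability can be sketched with Chebyshev's inequality. The sketch assumes that the states are weighted microcanonically and that $|a|$ grows in proportion to $E$, as it does for sum functions; $\delta$ is an arbitrary fixed positive number introduced only for this illustration.

$$P\bigl(|\mathfrak{A} - a| \ge \delta\,|a|\bigr) \;\le\; \frac{D(\mathfrak{A})}{\delta^2a^2} \;=\; \frac{o(E^2)}{\delta^2a^2} \;\longrightarrow\; 0 \qquad (E \to \infty).$$

For other distributions of the frequencies of the states the same conclusion needs the argument of Chapter III; the inequality above covers only the microcanonical weights themselves.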


We saw in the same section (III, §6) that if $\mathfrak{A}$ is a sum function, then
$D(\mathfrak{A})$, in any of the three basic statistical schemes, is given by the
expression

$$D(\mathfrak{A}) = \sum_{r=1}^{\infty}(\mu_r - \lambda_r^2)\,\langle a_r\rangle + \sum_{r=1}^{\infty}\sum_{s=1}^{\infty}\lambda_r\lambda_s\bigl[\langle a_ra_s\rangle - \langle a_r\rangle\langle a_s\rangle\bigr]. \tag{30}$$

Here it is assumed that

$$\mathfrak{A} = \sum_i \mathfrak{A}_i,$$

where the quantity $\mathfrak{A}_i$ depends only on the state of the $i$th particle. If the
$i$th particle is in a stationary state with energy $\varepsilon_r$, then $\lambda_r$ and $\mu_r$ denote
the mathematical expectations of the quantities $\mathfrak{A}_i$ and $\mathfrak{A}_i^2$, respectively.
(These expectations are assumed to be the same for all $i$.) To estimate the
microcanonical dispersion $D(\mathfrak{A})$, we must find asymptotic expressions for
the microcanonical averages $\langle a_r\rangle$ and $\langle a_ra_s\rangle$ of the occupation
numbers and their pairwise products. We can foresee that we must obtain these
asymptotic expressions rather accurately, since it is natural to expect that
in the differences $\langle a_ra_s\rangle - \langle a_r\rangle\langle a_s\rangle$ a series of important terms will
cancel one another. In fact, the accuracy to which we determined the
numbers $\langle a_r\rangle$ in §4, while fully satisfactory there, is quite insufficient for our
present purpose. Therefore, the asymptotic calculation of these numbers
must be carried out anew, on a more accurate basis. For this we need only
a well-known form of the local limit theorem. The simplified formula (28)
of I, §4 is no longer sufficient, and we must make use of the more accurate
formula (27) of the same section. Instead of formula (17) of §4 we now
obtain the more accurate formula


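$$P(S_V = E + u) = d(2\pi B)^{-1/2}e^{-u^2/2B} + (l_0 + l_1u)B^{-3/2} + O[V^{-2}(1 + |u|^3)],$$

where $d$ and $B$ have the same values as before, and $l_0$ and $l_1$ do not depend
on either $V$ or $u$. Using the fact that

$$e^{-u^2/2B} = 1 - \tfrac{1}{2}u^2B^{-1} + O(u^4B^{-2}),$$

we find

$$P(S_V = E + u) = d(2\pi B)^{-1/2} + (l_0 + l_1u + l_2u^2)B^{-3/2} + O[V^{-2}(1 + u^4)],$$

where $l_2 = -\tfrac{1}{2}d(2\pi)^{-1/2}$. With $q = E + u$, formula (15) of §3 therefore
gives an expression for the structure function which is more accurate than
the one we had in §4:

$$\Omega(E + u) = [\Phi(\beta)]^V e^{\beta(E+u)}\bigl\{d(2\pi B)^{-1/2} + (l_0 + l_1u + l_2u^2)B^{-3/2} + O[V^{-2}(1 + u^4)]\bigr\}; \tag{31}$$

and, in particular, for $u = 0$,

$$\Omega(E) = [\Phi(\beta)]^V e^{\beta E}\bigl\{d(2\pi B)^{-1/2} + l_0B^{-3/2} + O(V^{-2})\bigr\}.$$

By an elementary calculation we obtain

$$\Omega(E + u)/\Omega(E) = e^{\beta u}\bigl\{1 + (k_1u - \tfrac{1}{2}u^2)B^{-1} + O[V^{-3/2}(1 + u^4)]\bigr\}, \tag{32}$$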


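where $k_1$ is a constant (independent of both $E$ and $u$). This more accurate
formula is used to replace formula (18) of §4. Now, proceeding to the
derivation of the asymptotic formulas for $\langle a_r\rangle$ and $\langle a_ra_s\rangle$, we introduce the
following convenient notation: We put

$$T_r = T_r(\beta) = (e^{\beta\varepsilon_r} - 1)^{-1} = \sum_{m=1}^{\infty} e^{-m\beta\varepsilon_r}.$$

Then, clearly,

$$\sum_{m=1}^{\infty} m^k\varepsilon_r^k\,e^{-m\beta\varepsilon_r} = (-1)^k\,d^kT_r/d\beta^k = (-1)^k\,T_r^{(k)}.$$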
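The identity between the weighted geometric sums and the derivatives of $T_r$ is easy to confirm numerically. The Python sketch below is an illustration only: it compares truncated sums with closed-form first and second derivatives, which were obtained here by differentiating $(e^{\beta\varepsilon} - 1)^{-1}$ directly. The function names and the truncation length `terms` are assumptions of the example.

```python
import math

def t(beta, eps):
    """T(beta) = 1 / (exp(beta*eps) - 1)."""
    return 1.0 / math.expm1(beta * eps)

def t_prime(beta, eps):
    """dT/dbeta = -eps * e / (e - 1)^2 with e = exp(beta*eps)."""
    e = math.exp(beta * eps)
    return -eps * e / (e - 1.0) ** 2

def t_second(beta, eps):
    """d^2T/dbeta^2 = eps^2 * e * (e + 1) / (e - 1)^3."""
    e = math.exp(beta * eps)
    return eps ** 2 * e * (e + 1.0) / (e - 1.0) ** 3

def moment_sum(beta, eps, k, terms=2000):
    """Partial sum of sum_{m>=1} (m*eps)^k * exp(-m*beta*eps)."""
    return sum((m * eps) ** k * math.exp(-m * beta * eps)
               for m in range(1, terms + 1))

beta, eps = 0.7, 1.3
print(moment_sum(beta, eps, 0), t(beta, eps))          # k = 0: T itself
print(moment_sum(beta, eps, 1), -t_prime(beta, eps))   # k = 1: -T'
print(moment_sum(beta, eps, 2), t_second(beta, eps))   # k = 2: +T''
```

Each printed pair should agree to rounding error, with the alternating sign $(-1)^k$ accounting for the minus sign in the $k = 1$ line.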


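In view of equation (32), formula (5) of §2 now gives us [in analogy with
(19) of §4]

$$\begin{aligned}
\langle a_r\rangle &= \sum_{m=1}^{\infty}\Omega(E - m\varepsilon_r)/\Omega(E) \\
&= \sum_{m=1}^{\infty} e^{-m\beta\varepsilon_r}\bigl\{1 - (k_1m\varepsilon_r + \tfrac{1}{2}m^2\varepsilon_r^2)B^{-1} + O[V^{-3/2}(1 + m^4\varepsilon_r^4)]\bigr\} \\
&= T_r + (k_1T_r' - \tfrac{1}{2}T_r'')B^{-1} + O[V^{-3/2}(T_r + T_r^{(4)})].
\end{aligned} \tag{33}$$

This is a more accurate formula than equation (19) of §4. In our new
notation the latter would have the simple form

$$\langle a_r\rangle = T_r + O(V^{-1}).$$

Further, by using (32), for $r \ne s$, formula (8) of §2 gives

$$\begin{aligned}
\langle a_ra_s\rangle &= \sum_{k=1}^{\infty}\sum_{l=1}^{\infty}\Omega(E - k\varepsilon_r - l\varepsilon_s)/\Omega(E) \\
&= \sum_{k=1}^{\infty}\sum_{l=1}^{\infty} e^{-\beta(k\varepsilon_r + l\varepsilon_s)}\bigl\{1 - B^{-1}[k_1(k\varepsilon_r + l\varepsilon_s) + \tfrac{1}{2}(k\varepsilon_r + l\varepsilon_s)^2] + O[V^{-3/2}(1 + (k\varepsilon_r + l\varepsilon_s)^4)]\bigr\} \\
&= T_rT_s + B^{-1}[k_1(T_r'T_s + T_rT_s') - \tfrac{1}{2}(T_r''T_s + 2T_r'T_s' + T_rT_s'')] + O[V^{-3/2}(T_rT_s + T_r^{(4)}T_s + T_rT_s^{(4)})].
\end{aligned}$$

At the same time formula (33) yields

$$\langle a_r\rangle\langle a_s\rangle = T_rT_s + B^{-1}[k_1(T_r'T_s + T_rT_s') - \tfrac{1}{2}(T_r''T_s + T_rT_s'')] + O[V^{-3/2}(T_rT_s + T_r^{(4)}T_s + T_rT_s^{(4)})].$$

From the last two formulas we find

$$\langle a_ra_s\rangle - \langle a_r\rangle\langle a_s\rangle = -T_r'T_s'B^{-1} + O[V^{-3/2}(T_rT_s + T_r^{(4)}T_s + T_rT_s^{(4)})]. \tag{34}$$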


Thus, the coefficient of microcanonical correlation of the numbers $a_r$
and $a_s$ ($r \ne s$), as was to be expected, is always negative and infinitely
small in absolute magnitude, of order $V^{-1}$. This latter fact follows from
equations (34) and (35) (see below).

Formula (34) provides the information necessary to estimate the
microcanonical dispersion $D(\mathfrak{A})$. For this purpose we must first notice that in
view of (33), the quantities $\langle a_r\rangle$ are asymptotically equal to the
quantities $T_r$, which are functions of $\varepsilon_r$. Because of this, the first term on the
right side of equation (30) is, by the general formula (12) of §3,
asymptotically proportional to $V$. In the second (double) sum, we must
distinguish between the terms where $r \ne s$ and those where $r = s$. In the
terms of the first type, by (34), the numbers $\langle a_ra_s\rangle - \langle a_r\rangle\langle a_s\rangle$ are
asymptotically equal to $-T_r'T_s'B^{-1}$. Hence, the double sum of these terms,
on the basis of the same formula (12) of §3, gives a quantity which is
asymptotically proportional to $V^2B^{-1}$, i.e., again proportional to $V$. Finally,
the terms of the second type have the form $\lambda_r^2[\langle a_r^2\rangle - (\langle a_r\rangle)^2]$ and
the sum over $r$ extends from one to infinity. With the help of formulas (9)
and (32), we can find the asymptotic expression for $\langle a_r^2\rangle - (\langle a_r\rangle)^2$:

$$\langle a_r^2\rangle - (\langle a_r\rangle)^2 \approx T_r(T_r + 1). \tag{35}$$

Therefore, the sum of terms of the second type is asymptotically equal to

$$\sum_{r=1}^{\infty}\lambda_r^2\,T_r(T_r + 1),$$

and hence, by formula (12) of §3, it is asymptotically proportional to $V$.
(In all these estimates, of course, it is assumed that $|\lambda_r|$ and $\mu_r$ do not
increase too rapidly with increasing $r$, so that all the series obtained are
absolutely convergent.)
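The right side of (35) is the variance of a geometric law, which gives a quick plausibility check. The Python sketch below is an illustration only and assumes that a single occupation number $n$ takes the value $n$ with probability proportional to $e^{-\beta\varepsilon n}$; this is the independent-level approximation, not the microcanonical average itself. The function name and the truncation length are choices made for the example.

```python
import math

def occupation_moments(beta, eps, terms=2000):
    """Mean and variance of n when P(n) is proportional to
    exp(-beta*eps*n), n = 0, 1, 2, ... (series truncated at `terms`)."""
    q = math.exp(-beta * eps)
    weights = [q ** n for n in range(terms)]
    z = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / z
    second = sum(n * n * w for n, w in enumerate(weights)) / z
    return mean, second - mean * mean

beta, eps = 0.5, 1.0
T = 1.0 / math.expm1(beta * eps)       # T_r of the text
mean, var = occupation_moments(beta, eps)
print(mean, T)                          # mean occupation versus T_r
print(var, T * (T + 1.0))               # variance versus T_r (T_r + 1)
```

Under the stated assumption the mean reproduces $T_r$ and the variance reproduces $T_r(T_r + 1)$, which is the leading term that (35) asserts for the microcanonical dispersion of $a_r$.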

Formula (30) thus proves that $D(\mathfrak{A})$ is asymptotically proportional to
$V$. But, as we saw above, in order to establish the suitability of the mean
value $\langle\mathfrak{A}\rangle$, it is sufficient to have

$$D(\mathfrak{A}) = o(E^2);$$

or, equivalently,

$$D(\mathfrak{A}) = o(V^2).$$


Hence, the estimates we have obtained easily prove the suitability of the 
microcanonical averages of sum functions. As was noted above, this justi- 
fies, on the theoretical side, the arbitrary choice of the microcanonical 
method of averaging, and, on the practical side, explains the great value of 
the microcanonical averages for making comparisons with experimental 
results. 


Chapter V 


FOUNDATIONS OF THE STATISTICS 
OF MATERIAL PARTICLES 


§1. Review of fundamental concepts 


We shall now see that the methods developed in the preceding chapter 
can also be used to construct a statistical theory for a system of material 
particles. The basic new fact which must be considered in this transition is 
that the number N of particles composing the system is now rigidly fixed 
(just as the total energy E of the system). Hence, the possible states of a 
system, with a given total energy $E$ and composed of $N$ particles of a
particular type, are now described by the eigenfunctions of the operator $\mathcal{H}$
which correspond to the eigenvalue $E$. In the case of complete statistics,
all such eigenfunctions are “admissible”; but in the case of symmetric (or
antisymmetric) statistics only symmetric (or antisymmetric) eigenfunctions
are admissible. In all cases, we select a (finite) linear basis of mutually
orthogonal and normalized functions from the family of all admissible
eigenfunctions. The number of functions in this basis depends on $N$ and $E$. We
denote this number by $\Omega(N, E)$ and call it the structure function of the
system. The fact that the structure function $\Omega(N, E)$ depends on two
arguments is the reason that the statistics of material particles is more
complicated in formal respects than the statistics of photons.

We saw in III, §4 that, in any of the three basic statistical schemes, the 
functions composing the above mentioned linear basis can be chosen from 
among the so-called “fundamental” eigenfunctions of the system. In a state 
described by such a function, the number $a_r$ of particles in a state with
energy level $\varepsilon_r$ has a definite value. Hence, a uniquely defined set of
“occupation numbers” $a_r$ ($r = 1, 2, \ldots$) corresponds to each “fundamental”
eigenfunction which is a member of our basis. The numbers $a_r$ must satisfy the
relations

$$\sum_{r=1}^{\infty} a_r = N, \qquad \sum_{r=1}^{\infty} a_r\varepsilon_r = E. \tag{K}$$

Conversely, a specific number of terms of the linear basis corresponds to
each definite set of occupation numbers satisfying the equations (K). This
number is different for the different statistical schemes. We saw (III, §5)
that it can be represented as

$$C(N)\prod_{r=1}^{\infty}\gamma(a_r), \tag{1}$$


where $C(N) = N!$ in the case of complete statistics, $C(N) = 1$ in the case
of the other two statistics, and

$$\gamma(a) = 1/a! \quad \text{(complete statistics)},$$

$$\gamma(a) = 1 \quad \text{(symmetric statistics)},$$

$$\gamma(a) = \begin{cases} 1 & (a \le 1) \\ 0 & (a > 1) \end{cases} \quad \text{(antisymmetric statistics)}.$$

From the above we can obtain expressions for the structure function
$\Omega(N, E)$ for all three cases. We find in the case of complete statistics

$$\Omega(N, E) = \sum_{(K)} N!\,\Bigl(\prod_{r=1}^{\infty} a_r!\Bigr)^{-1}, \tag{2}$$

in the case of symmetric statistics

$$\Omega(N, E) = \sum_{(K)} 1, \tag{3}$$

and in the case of antisymmetric statistics

$$\Omega(N, E) = {\sum_{(K)}}^{*}\, 1, \tag{4}$$

where in the first two cases the summation extends over all sets of integers
$a_r \ge 0$ satisfying the conditions (K). The asterisk (*) in the third case
indicates that the summation extends only over those solutions of the
equations (K) for which $a_r \le 1$ ($r = 1, 2, \ldots$). These three equations can, of
course, be combined into the single equation

$$\Omega(N, E) = C(N)\sum_{(K)}\prod_{r=1}^{\infty}\gamma(a_r),$$

where the summation extends over all sets of integers $a_r \ge 0$ ($r = 1, 2, \ldots$)
satisfying the relations (K).
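For a small system the three structure functions can be computed by direct enumeration of the solutions of (K). The Python sketch below is an illustration, not the author's method: it assumes a finite list of positive integer energy levels (the tuple `LEVELS` is hypothetical), and the function names are introduced here.

```python
from math import factorial, prod

def occupation_sets(levels, n, e, cap=None):
    """Yield tuples (a_1, ..., a_m) of integers a_i >= 0 (and a_i <= cap if
    cap is given) with sum a_i = n and sum a_i * levels[i] = e."""
    if not levels:
        if n == 0 and e == 0:
            yield ()
        return
    a = 0
    while a <= n and a * levels[0] <= e and (cap is None or a <= cap):
        for tail in occupation_sets(levels[1:], n - a, e - a * levels[0], cap):
            yield (a,) + tail
        a += 1

def structure_function(levels, n, e, scheme):
    """Omega(N, E) = C(N) * sum over (K) of prod gamma(a_r)."""
    cap = 1 if scheme == "antisymmetric" else None
    total = 0
    for a in occupation_sets(levels, n, e, cap):
        if scheme == "complete":
            total += factorial(n) // prod(factorial(x) for x in a)  # N!/prod a_r!
        else:
            total += 1  # gamma = 1 on every admitted occupation set
    return total

LEVELS = (1, 2, 3, 4)   # hypothetical single-particle energy levels
for scheme in ("complete", "symmetric", "antisymmetric"):
    print(scheme, structure_function(LEVELS, 3, 6, scheme))
```

With these levels, $N = 3$ and $E = 6$, the admissible occupation sets are $(1,1,1,0)$, $(0,3,0,0)$ and $(2,0,0,1)$, so the symmetric count is 3, the complete count is $6 + 1 + 3 = 10$, and only the first set survives the restriction $a_r \le 1$.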

As stated in III, §2, we take for the microcanonical average of any phase
function (i.e., any quantity which assumes a definite value in each of the
basic states of the system) the arithmetic mean of the values which it
assumes in those $\Omega(N, E)$ states which are described by the eigenfunctions
of the linear basis we selected. The justification for this choice of method
of averaging can only be given later when the question of the suitability of
microcanonical averages is considered.

The above statements constitute the starting points for the following in- 
vestigations. 


§2. Mean values of the occupation numbers 


We have already had many opportunities to note that an important
problem of a statistical theory is the determination of the mean values of the
occupation numbers $a_r$ as well as those of their pairwise products $a_ra_s$. In
the photon case discussed in §2 of the preceding chapter, we expressed the
mean values of the numbers $a_r$ and $a_ra_s$ in terms of the structure function
of the system. We must now take this first step for the case of material
particles and, in addition, for all three basic statistical schemes.
Again, let $a_r(U)$ be the value of the number $a_r$ in the state of the system
described by the fundamental eigenfunction $U$. In the sequel we denote
$1/\Omega(N, E)$ by $\Omega^{-1}$. Then

$$\langle a_r\rangle = \Omega^{-1}\sum_U a_r(U), \tag{5}$$

where the summation extends over all the fundamental functions of the
linear basis.

1. Complete Statistics 


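In the case of complete statistics, to each set of occupation numbers
$a_r$ ($r = 1, 2, \ldots$) satisfying the conditions (K) of §1 there correspond
$N!\,(\prod_{i=1}^{\infty} a_i!)^{-1}$ different functions of the linear basis and, hence, this
same number of terms is present in the sum on the right side of formula (5).
Therefore, in the case of complete statistics

$$\langle a_r\rangle = \Omega^{-1}\sum_{(K)} N!\,\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1} a_r,$$

where the summation extends over all sets of occupation numbers which
satisfy the conditions (K).

Since in the sum on the right side of this formula all the terms for which
$a_r = 0$ vanish, the conditions (K) which have the form

$$a_i \ge 0 \ (i = 1, 2, \ldots), \qquad \sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E,$$

can, without altering the result, be replaced by the conditions

$$a_r > 0, \quad a_i \ge 0 \ (i \ne r), \qquad \sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E. \tag{L}$$

Hence,

$$\langle a_r\rangle = N!\,\Omega^{-1}\sum_{(L)} a_r\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1},$$

where the summation extends over all systems of numbers $a_i$ satisfying the
conditions (L). Now, if we put $b_r = a_r - 1$, $b_i = a_i$ ($i \ne r$), then the
conditions (L) are equivalent to the conditions

$$b_i \ge 0 \ (i = 1, 2, \ldots), \qquad \sum_{i=1}^{\infty} b_i = N - 1, \qquad \sum_{i=1}^{\infty} b_i\varepsilon_i = E - \varepsilon_r, \tag{L'}$$

and we obtain

$$\langle a_r\rangle = N!\,\Omega^{-1}\sum_{(L')}\Bigl(\prod_{i=1}^{\infty} b_i!\Bigr)^{-1}.$$

In view of equation (2),

$$\sum_{(L')}(N - 1)!\,\Bigl(\prod_{i=1}^{\infty} b_i!\Bigr)^{-1} = \Omega(N - 1, E - \varepsilon_r),$$

and it then follows that

$$\langle a_r\rangle = N\,\Omega^{-1}\,\Omega(N - 1, E - \varepsilon_r). \tag{6}$$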
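The mean values of the products $a_ra_s$ are found analogously. For $r \ne s$
we have

$$\langle a_ra_s\rangle = N!\,\Omega^{-1}\sum_{(K)} a_ra_s\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1}.$$

The only terms of the sum $\sum_{(K)}$ different from zero are those in which
$a_r > 0$, $a_s > 0$. Therefore,

$$\langle a_ra_s\rangle = N!\,\Omega^{-1}\sum_{(K),\,a_r>0,\,a_s>0} a_ra_s\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1}.$$

Putting $b_r = a_r - 1$, $b_s = a_s - 1$, $b_i = a_i$ ($i \ne r$, $i \ne s$), we obtain

$$\langle a_ra_s\rangle = N!\,\Omega^{-1}\sum_{(M)}\Bigl(\prod_{i=1}^{\infty} b_i!\Bigr)^{-1},$$

where (M) denotes the set of conditions

$$b_i \ge 0 \ (i = 1, 2, \ldots), \qquad \sum_{i=1}^{\infty} b_i = N - 2, \qquad \sum_{i=1}^{\infty} b_i\varepsilon_i = E - \varepsilon_r - \varepsilon_s. \tag{M}$$

Again, in view of equation (2),

$$\sum_{(M)}(N - 2)!\,\Bigl(\prod_{i=1}^{\infty} b_i!\Bigr)^{-1} = \Omega(N - 2, E - \varepsilon_r - \varepsilon_s),$$

and, therefore, we find

$$\langle a_ra_s\rangle = N(N - 1)\,\Omega^{-1}\,\Omega(N - 2, E - \varepsilon_r - \varepsilon_s). \tag{7}$$

Finally, we find

$$\langle a_r(a_r - 1)\rangle = N!\,\Omega^{-1}\sum_{(K)} a_r(a_r - 1)\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1} = N!\,\Omega^{-1}\sum_{(K),\,a_r\ge 2} a_r(a_r - 1)\Bigl(\prod_{i=1}^{\infty} a_i!\Bigr)^{-1}.$$

Setting $b_r = a_r - 2$, $b_i = a_i$ ($i \ne r$) in the last expression it follows that

$$\langle a_r(a_r - 1)\rangle = N!\,\Omega^{-1}{\sum}'\Bigl(\prod_{i=1}^{\infty} b_i!\Bigr)^{-1} = N(N - 1)\,\Omega^{-1}\,\Omega(N - 2, E - 2\varepsilon_r), \tag{8}$$

where $\sum'$ denotes summation over all $b_i$ such that $b_i \ge 0$, $\sum_{i=1}^{\infty} b_i = N - 2$,
and $\sum_{i=1}^{\infty} b_i\varepsilon_i = E - 2\varepsilon_r$. Therefore, in virtue of equation (6)

$$\langle a_r^2\rangle = \langle a_r(a_r - 1)\rangle + \langle a_r\rangle = \Omega^{-1}\{N(N - 1)\,\Omega(N - 2, E - 2\varepsilon_r) + N\,\Omega(N - 1, E - \varepsilon_r)\}. \tag{9}$$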
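Formulas (6), (7) and (9) can be confirmed on a small system by brute force. The Python sketch below is an illustration under assumptions made here: a finite tuple of positive integer levels `LEVELS` (hypothetical), zero-based indices `R` and `S` for the two levels, and function names of our own. Both sides are multiplied by $\Omega(N, E)$ so that only integers are compared.

```python
from math import factorial, prod

def occupation_sets(levels, n, e):
    """Yield tuples a with a_i >= 0, sum a_i = n, sum a_i * levels[i] = e."""
    if not levels:
        if n == 0 and e == 0:
            yield ()
        return
    a = 0
    while a <= n and a * levels[0] <= e:
        for tail in occupation_sets(levels[1:], n - a, e - a * levels[0]):
            yield (a,) + tail
        a += 1

def weight(a):
    """N!/prod(a_i!): basis functions per occupation set (complete statistics)."""
    return factorial(sum(a)) // prod(factorial(x) for x in a)

def omega(levels, n, e):
    """Structure function (2) by direct enumeration."""
    return sum(weight(a) for a in occupation_sets(levels, n, e))

def weighted_total(levels, n, e, f):
    """Omega(N, E) times the microcanonical mean of f(a)."""
    return sum(weight(a) * f(a) for a in occupation_sets(levels, n, e))

LEVELS, N, E, R, S = (1, 2, 3, 4), 4, 10, 0, 2   # hypothetical small system
er, es = LEVELS[R], LEVELS[S]

lhs6 = weighted_total(LEVELS, N, E, lambda a: a[R])
rhs6 = N * omega(LEVELS, N - 1, E - er)
lhs7 = weighted_total(LEVELS, N, E, lambda a: a[R] * a[S])
rhs7 = N * (N - 1) * omega(LEVELS, N - 2, E - er - es)
lhs9 = weighted_total(LEVELS, N, E, lambda a: a[R] ** 2)
rhs9 = N * (N - 1) * omega(LEVELS, N - 2, E - 2 * er) + rhs6
print(lhs6 == rhs6, lhs7 == rhs7, lhs9 == rhs9)
```

The enumeration is exponential in the number of levels, so this is a check of the identities and not a way of computing with realistic systems; that is the role of the asymptotic estimates developed in the following sections.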


2. Symmetric Statistics 


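In the case of symmetric statistics formula (3) of §1 shows that $\Omega(N, E)$
is the number of solutions of the system of equations

$$\sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E$$

with integral $a_i \ge 0$. We denote by $\Gamma_k$ ($k = 1, 2, \ldots$) the number of
solutions of the same system which satisfy the subsidiary condition $a_r \ge k$.
Then, evidently, $\Gamma_k - \Gamma_{k+1}$ will be the number of solutions of the system
of equations

$$a_r = k, \quad a_i \ge 0 \ (i = 1, 2, \ldots), \qquad \sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E,$$

whence

$$\langle a_r\rangle = \Omega^{-1}\sum_{k=1}^{\infty} k\,(\Gamma_k - \Gamma_{k+1}).$$

Using the Abel transformation, we easily obtain from this the result

$$\langle a_r\rangle = \Omega^{-1}\sum_{k=1}^{\infty}\Gamma_k.$$

But, on the other hand, putting $b_r = a_r - k$, $b_i = a_i$ ($i \ne r$), we find that
$\Gamma_k$ is the number of solutions, with integral $b_i \ge 0$, of the system

$$\sum_{i=1}^{\infty} b_i = N - k, \qquad \sum_{i=1}^{\infty} b_i\varepsilon_i = E - k\varepsilon_r,$$

i.e.,

$$\Gamma_k = \Omega(N - k, E - k\varepsilon_r),$$

and, therefore,

$$\langle a_r\rangle = \Omega^{-1}\sum_{k=1}^{\infty}\Omega(N - k, E - k\varepsilon_r). \tag{10}$$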
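Furthermore, for $r \ne s$, we denote by $\Gamma_{kl}$ ($k = 1, 2, \ldots$; $l = 1, 2, \ldots$)
the number of solutions of the system

$$\sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E \qquad (a_i \ge 0,\ a_r \ge k,\ a_s \ge l).$$

Then, as is easily calculated, the number of solutions of the system

$$\sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E \qquad (a_i \ge 0,\ a_r = k,\ a_s = l)$$

is equal to

$$\Gamma_{kl} - \Gamma_{k+1,l} - \Gamma_{k,l+1} + \Gamma_{k+1,l+1}.$$

Hence,

$$\langle a_ra_s\rangle = \Omega^{-1}\sum_{k,l=1}^{\infty} kl\,\{\Gamma_{kl} - \Gamma_{k+1,l} - \Gamma_{k,l+1} + \Gamma_{k+1,l+1}\},$$

and, by using the Abel transformation twice,

$$\langle a_ra_s\rangle = \Omega^{-1}\sum_{k,l=1}^{\infty}\Gamma_{kl}.$$

Putting

$$b_i = a_i \ (i \ne r,\ i \ne s), \qquad b_r = a_r - k, \qquad b_s = a_s - l,$$

we easily find as before that

$$\Gamma_{kl} = \Omega(N - k - l, E - k\varepsilon_r - l\varepsilon_s).$$

Consequently,

$$\langle a_ra_s\rangle = \Omega^{-1}\sum_{k,l=1}^{\infty}\Omega(N - k - l, E - k\varepsilon_r - l\varepsilon_s). \tag{11}$$

Finally, a completely analogous computation yields

$$\langle a_r^2\rangle = \Omega^{-1}\sum_{k=1}^{\infty} k^2(\Gamma_k - \Gamma_{k+1}) = \Omega^{-1}\sum_{k=1}^{\infty}(2k - 1)\,\Gamma_k,$$

from which it follows that

$$\langle a_r^2\rangle = \Omega^{-1}\sum_{k=1}^{\infty}(2k - 1)\,\Omega(N - k, E - k\varepsilon_r). \tag{12}$$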
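The sums (10), (11) and (12) terminate for a finite system, because $\Omega(N - k, \cdot) = 0$ once $k > N$, so they can be compared with a direct enumeration. The Python sketch below is an illustration only; the level tuple `LEVELS`, the zero-based indices `R` and `S`, and the function names are assumptions made here. Both sides are multiplied by $\Omega(N, E)$.

```python
def occupation_sets(levels, n, e):
    """Yield tuples a with a_i >= 0, sum a_i = n, sum a_i * levels[i] = e."""
    if not levels:
        if n == 0 and e == 0:
            yield ()
        return
    a = 0
    while a <= n and a * levels[0] <= e:
        for tail in occupation_sets(levels[1:], n - a, e - a * levels[0]):
            yield (a,) + tail
        a += 1

def omega(levels, n, e):
    """Symmetric statistics: Omega(N, E) is the number of solutions of (K)."""
    return sum(1 for _ in occupation_sets(levels, n, e))

def total(levels, n, e, f):
    """Sum of f(a) over all solutions, i.e. Omega(N, E) times the mean of f."""
    return sum(f(a) for a in occupation_sets(levels, n, e))

LEVELS, N, E, R, S = (1, 2, 3, 4), 4, 10, 0, 2   # hypothetical small system
er, es = LEVELS[R], LEVELS[S]
ks = range(1, N + 1)   # terms with k > N vanish

lhs10 = total(LEVELS, N, E, lambda a: a[R])
rhs10 = sum(omega(LEVELS, N - k, E - k * er) for k in ks)
lhs11 = total(LEVELS, N, E, lambda a: a[R] * a[S])
rhs11 = sum(omega(LEVELS, N - k - l, E - k * er - l * es) for k in ks for l in ks)
lhs12 = total(LEVELS, N, E, lambda a: a[R] ** 2)
rhs12 = sum((2 * k - 1) * omega(LEVELS, N - k, E - k * er) for k in ks)
print(lhs10 == rhs10, lhs11 == rhs11, lhs12 == rhs12)
```

For this example there are five occupation sets, and the level with energy 1 is occupied four times in total across them, which is the value both sides of (10) take before division by $\Omega$.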


3. Antisymmetric Statistics 


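In this case, the numbers $a_r$ can assume only the values 0 and 1. If we
denote by $\Omega_r(N, E)$ the number of those solutions of the system

$$\sum_{i=1}^{\infty} a_i = N, \qquad \sum_{i=1}^{\infty} a_i\varepsilon_i = E, \qquad 0 \le a_i \le 1, \tag{N}$$

in which $a_r = 1$, then obviously

$$\langle a_r\rangle = \Omega^{-1}\,\Omega_r(N, E).$$

Denoting by $\Omega_{\bar r}(N, E)$ the number of those solutions of the system (N)
in which $a_r = 0$, we have on the one hand that

$$\Omega_r(N, E) + \Omega_{\bar r}(N, E) = \Omega(N, E).$$

But, on the other hand, putting $b_i = a_i$ ($i \ne r$), $b_r = a_r - 1$, we see that
$\Omega_r(N, E)$ equals the number of those solutions of the system

$$\sum_{i=1}^{\infty} b_i = N - 1, \qquad \sum_{i=1}^{\infty} b_i\varepsilon_i = E - \varepsilon_r, \qquad 0 \le b_i \le 1,$$

in which $b_r = 0$, i.e., $\Omega_r(N, E)$ equals $\Omega_{\bar r}(N - 1, E - \varepsilon_r)$. Consequently,

$$\Omega_r(N, E) = \Omega(N - 1, E - \varepsilon_r) - \Omega_r(N - 1, E - \varepsilon_r).$$

Successive application of this recurrence formula yields

$$\Omega_r(N, E) = \sum_{k=1}^{\infty}(-1)^{k-1}\,\Omega(N - k, E - k\varepsilon_r),$$

and, therefore,

$$\langle a_r\rangle = \Omega^{-1}\sum_{k=1}^{\infty}(-1)^{k-1}\,\Omega(N - k, E - k\varepsilon_r). \tag{13}$$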
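Now, suppose $r \ne s$ and let 1) $\Omega_{rs}(N, E)$, 2) $\Omega_{r\bar s}(N, E)$, 3) $\Omega_{\bar r s}(N, E)$ and
4) $\Omega_{\bar r\bar s}(N, E)$ denote, respectively, the numbers of those solutions of the
system (N) in which 1) $a_r = a_s = 1$, 2) $a_r = 1$, $a_s = 0$, 3) $a_r = 0$, $a_s = 1$,
and 4) $a_r = a_s = 0$. Then, first of all

$$\Omega(N, E) = \Omega_{rs}(N, E) + \Omega_{r\bar s}(N, E) + \Omega_{\bar r s}(N, E) + \Omega_{\bar r\bar s}(N, E),$$

and

$$\langle a_ra_s\rangle = \Omega^{-1}\,\Omega_{rs}(N, E).$$

On the other hand, our customary method of transforming from the
numbers $a_i$ to the numbers $b_i$ easily yields

$$\begin{aligned}
\Omega_{rs}(N, E) &= \Omega_{\bar r s}(N - 1, E - \varepsilon_r) = \Omega_{r\bar s}(N - 1, E - \varepsilon_s) \\
&= \Omega_{\bar r\bar s}(N - 2, E - \varepsilon_r - \varepsilon_s) \\
&= \Omega(N - 2, E - \varepsilon_r - \varepsilon_s) - \Omega_{rs}(N - 2, E - \varepsilon_r - \varepsilon_s) \\
&\quad - \Omega_{r\bar s}(N - 2, E - \varepsilon_r - \varepsilon_s) - \Omega_{\bar r s}(N - 2, E - \varepsilon_r - \varepsilon_s).
\end{aligned}$$

By repeated use of this formula, we find

$$\Omega_{rs}(N, E) = \sum_{k,l=1}^{\infty}(-1)^{k+l}\,\Omega(N - k - l, E - k\varepsilon_r - l\varepsilon_s),$$

and, therefore (for $r \ne s$),

$$\langle a_ra_s\rangle = \Omega^{-1}\sum_{k,l=1}^{\infty}(-1)^{k+l}\,\Omega(N - k - l, E - k\varepsilon_r - l\varepsilon_s). \tag{14}$$

Finally, in the case of antisymmetric statistics it is always true that
$a_r^2 = a_r$, so that in virtue of (13)

$$\langle a_r^2\rangle = \langle a_r\rangle = \Omega^{-1}\sum_{k=1}^{\infty}(-1)^{k-1}\,\Omega(N - k, E - k\varepsilon_r). \tag{15}$$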
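The alternating sums (13) and (14) can likewise be checked by enumeration with every occupation number restricted to 0 or 1. The Python sketch below is an illustration only; the level tuple `LEVELS`, the zero-based indices `R` and `S`, and the function names are assumptions made here, and both sides are multiplied by $\Omega(N, E)$.

```python
def occupation_sets(levels, n, e):
    """Yield tuples a with a_i in {0, 1}, sum a_i = n, sum a_i * levels[i] = e."""
    if not levels:
        if n == 0 and e == 0:
            yield ()
        return
    for a in (0, 1):
        if a <= n and a * levels[0] <= e:
            for tail in occupation_sets(levels[1:], n - a, e - a * levels[0]):
                yield (a,) + tail

def omega(levels, n, e):
    """Antisymmetric statistics: number of solutions of the system (N)."""
    return sum(1 for _ in occupation_sets(levels, n, e))

def total(levels, n, e, f):
    """Sum of f(a) over all solutions, i.e. Omega(N, E) times the mean of f."""
    return sum(f(a) for a in occupation_sets(levels, n, e))

LEVELS, N, E, R, S = (1, 2, 3, 4, 5), 3, 8, 0, 2   # hypothetical small system
er, es = LEVELS[R], LEVELS[S]
ks = range(1, N + 1)   # terms with k > N vanish

lhs13 = total(LEVELS, N, E, lambda a: a[R])
rhs13 = sum((-1) ** (k - 1) * omega(LEVELS, N - k, E - k * er) for k in ks)
lhs14 = total(LEVELS, N, E, lambda a: a[R] * a[S])
rhs14 = sum((-1) ** (k + l) * omega(LEVELS, N - k - l, E - k * er - l * es)
            for k in ks for l in ks)
print(lhs13 == rhs13, lhs14 == rhs14)
```

In this example the admissible states are the level sets $\{1, 2, 5\}$ and $\{1, 3, 4\}$; the level with energy 1 is occupied in both, and the levels with energies 1 and 3 are jointly occupied in one of them.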


By looking at the set of formulas (6)-(15) and recalling the definition
of $\Omega^{-1}$, we see now that in all three statistical schemes the mean values
$\langle a_r\rangle$, $\langle a_ra_s\rangle$ ($r \ne s$) and $\langle a_r^2\rangle$ are expressed very simply by ratios
of the form

$$\Omega(N - u, E - v)/\Omega(N, E), \tag{16}$$


where u and v are positive numbers. Consequently, if we find sufficiently 
simple and accurate estimates of (16), then we shall be able to obtain the 
required asymptotic estimates for the mean values of the occupation num- 
bers, their squares and their pairwise products. We shall find such estimates 
in the next section. 

However, we shall first derive one more simple formula which will be
needed later. We know that in the sequence of energy levels $\varepsilon_r$ (of the
individual particles comprising the system) the same number can occur more
than once, i.e., it is possible that $\varepsilon_r = \varepsilon_s$ for $r \ne s$. In this case, for brevity,
let us put $\langle a_ra_s\rangle = \tau_r$. We obtain an expression for the quantity $\tau_r$ for
our three basic statistical schemes by putting $\varepsilon_s = \varepsilon_r$ in formulas (7), (11)
and (14).

In the case of complete statistics formulas (7) and (8) easily yield

$$\langle a_r^2\rangle - \langle a_r\rangle = \langle a_r(a_r - 1)\rangle = \tau_r. \tag{17}$$

In the case of symmetric statistics we find, in virtue of (10), (12) and (11),

$$\begin{aligned}
\langle a_r^2\rangle - \langle a_r\rangle &= 2\,\Omega^{-1}\sum_{m=1}^{\infty}(m - 1)\,\Omega(N - m, E - m\varepsilon_r) \\
&= 2\,\Omega^{-1}\sum_{k,l=1}^{\infty}\Omega(N - k - l, E - k\varepsilon_r - l\varepsilon_r) \\
&= 2\tau_r.
\end{aligned} \tag{18}$$

Finally, in the case of antisymmetric statistics $a_r^2 = a_r$, and consequently,

$$\langle a_r^2\rangle - \langle a_r\rangle = 0. \tag{19}$$

By using the “index of symmetry” $\sigma$ of a system, introduced in III, §3,
formulas (17), (18) and (19) can be combined into the one formula

$$\langle a_r^2\rangle - \langle a_r\rangle = (\sigma + 1)\,\tau_r, \tag{20}$$

which, therefore, holds for all three fundamental statistical schemes.
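Formula (20) can be tested on a small system with one doubly occurring level. The Python sketch below is an illustration only. It assumes the values $\sigma = 0$, $1$, $-1$ for complete, symmetric and antisymmetric statistics, which is what (17), (18) and (19) require of the factor $\sigma + 1$; the level tuple, the indices and the function names are hypothetical.

```python
from math import factorial, prod

SIGMA = {"complete": 0, "symmetric": 1, "antisymmetric": -1}  # assumed index values

def occupation_sets(levels, n, e, cap=None):
    """Yield tuples a with 0 <= a_i (<= cap), sum a_i = n, sum a_i*levels[i] = e."""
    if not levels:
        if n == 0 and e == 0:
            yield ()
        return
    a = 0
    while a <= n and a * levels[0] <= e and (cap is None or a <= cap):
        for tail in occupation_sets(levels[1:], n - a, e - a * levels[0], cap):
            yield (a,) + tail
        a += 1

def weight(a, scheme):
    """Number of basis functions belonging to one occupation set."""
    if scheme == "complete":
        return factorial(sum(a)) // prod(factorial(x) for x in a)
    return 1

def both_sides_of_20(levels, n, e, r, s, scheme):
    """Return Omega*(<a_r^2> - <a_r>) and Omega*(sigma + 1)*<a_r a_s>
    for two distinct indices r, s carrying the same energy."""
    if r == s or levels[r] != levels[s]:
        raise ValueError("r and s must be distinct indices of equal levels")
    cap = 1 if scheme == "antisymmetric" else None
    lhs = rhs = 0
    for a in occupation_sets(levels, n, e, cap):
        w = weight(a, scheme)
        lhs += w * (a[r] ** 2 - a[r])
        rhs += w * (SIGMA[scheme] + 1) * a[r] * a[s]
    return lhs, rhs

LEVELS = (1, 2, 2, 3)   # hypothetical levels; indices 1 and 2 are degenerate
for scheme in SIGMA:
    print(scheme, both_sides_of_20(LEVELS, 4, 8, 1, 2, scheme))
```

The two entries of each printed pair should coincide; in the antisymmetric case both vanish, the left one because $a_r^2 = a_r$ and the right one because $\sigma + 1 = 0$.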


§3. Reduction to a problem of the theory of probability 


Now we must consider the reduction of the problem of finding an
asymptotic estimate for the function $\Omega(N, E)$ to a certain limit problem of the
theory of probability. We indicated the general method by which this can
be done in §3 of the previous chapter. However, we must take into account
two essential facts which differentiate our new problem from the one we
solved in IV, §3: 1) We dealt only with symmetric statistics in the photon
case, whereas now we must include all three schemes; 2) The structure
function $\Omega(E)$ in the photon case depended only on the single variable $E$; now
$\Omega(N, E)$ depends on the variables $N$ and $E$ in each of the three
fundamental schemes. This last fact implies that the solution of our new problem
will require use of the two-dimensional limit theorems of the theory of
probability.

As in the previous chapter, let us denote by $g_k$ the number of energy levels
of an individual particle included between $k$ and $k + 1$, when the system
occupies unit volume. Following the same reasoning used in IV, §3, we
again conclude that the number $g_k$ becomes $Vg_k$ when the system occupies
the volume $V$. Hence, the structure function $\Omega(N, E)$ depends strongly on
$V$. In deriving our asymptotic formulas, we will assume that the numbers
$N$, $E$ and $V$ approach infinity but remain in constant ratios. Hence,
equation (12) of §3 of the preceding chapter remains valid so that here also the
sum of each absolutely convergent series of the form

$$\sum_{r=1}^{\infty} f(\varepsilon_r)$$

is a quantity proportional to $V$ (and to each of the numbers $N$, $E$).
Let us now introduce the two-dimensional distribution $p_k(x, y)$. This
distribution is defined for each integer $k$ as follows:
1) The random variable $x$ can assume only non-negative integral values;
2) If $x = n$ ($n = 0, 1, \ldots$), then $y = nk$;
3) The probability that $x = n$, $y = nk$, is

$$P(x = n, y = nk) = \gamma(n)\,e^{-(\alpha + \beta k)n}\Bigl\{\sum_{i=0}^{\infty}\gamma(i)\,e^{-(\alpha + \beta k)i}\Bigr\}^{-1} \qquad (n = 0, 1, \ldots), \tag{21}$$

where $\alpha$ and $\beta$ are parameters, whose values we shall choose below, and
$\gamma(n)$ is a function of the integral, non-negative variable $n$. We introduced
$\gamma(n)$ in Chapter III and defined it for the various kinds of statistics
as follows:

(I)  1) for complete statistics, $\gamma(n) = 1/n!$,
     2) for symmetric statistics, $\gamma(n) = 1$,
     3) for antisymmetric statistics, $\gamma(n) = 1$ for $n = 0$ and $n = 1$,
        and $\gamma(n) = 0$ for $n > 1$.

It is understood that the values of the parameters $\alpha$ and $\beta$ must be chosen
so that the series in brackets on the right side of (21) converges for all
$k \ge 1$.

The distribution $p_k(x, y)$ just defined is evidently degenerate: All pairs
of possible values of $(x, y)$ obeying this law are located on one half of the
line $y = kx$. However, this fact will not be important in our development.

Now let us consider the infinite sequence $(x_i, y_i)$ ($i = 1, 2, \ldots$) of pairs
of random variables among which there are

$g_1$ pairs subject to the law $p_1(x, y)$,
$g_2$ pairs subject to the law $p_2(x, y)$
and, in general,
$g_k$ pairs subject to the law $p_k(x, y)$ ($k = 1, 2, \ldots$).

We assume that the pairs with different subscripts are mutually
independent.

The probability that the pair of random variables $(x, y)$, distributed
according to the law $p_k(x, y)$, have a value different from zero can be found
from equation (21). This probability is

$$\frac{\sum_{n=1}^{\infty}\gamma(n)\,e^{-(\alpha + \beta k)n}}{\sum_{n=0}^{\infty}\gamma(n)\,e^{-(\alpha + \beta k)n}} = e^{-(\alpha + \beta k)}\,\frac{\sum_{n=0}^{\infty}\gamma(n + 1)\,e^{-(\alpha + \beta k)n}}{\sum_{n=0}^{\infty}\gamma(n)\,e^{-(\alpha + \beta k)n}} \le e^{-\alpha - \beta k},$$

since $\gamma(n + 1) \le \gamma(n)$ ($n = 0, 1, 2, \ldots$) for all three kinds of statistics.
Consequently, the probability that at least one of the $g_k$ pairs of our
sequence, distributed according to the law $p_k(x, y)$, have a value different from
zero does not exceed

$$g_k\,e^{-\alpha - \beta k}.$$
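The law (21) and the bound on the probability of a non-zero value can be evaluated directly for the three choices of $\gamma(n)$. The Python sketch below is an illustration only; the function names, the truncation length `terms` and the sample values of $\alpha$, $\beta$ and $k$ are assumptions made here.

```python
import math

def gamma_weight(n, scheme):
    """gamma(n) for the three statistical schemes."""
    if scheme == "complete":
        return math.exp(-math.lgamma(n + 1))   # 1/n! without overflow
    if scheme == "symmetric":
        return 1.0
    if scheme == "antisymmetric":
        return 1.0 if n <= 1 else 0.0
    raise ValueError(scheme)

def p_k(n, k, alpha, beta, scheme, terms=400):
    """P(x = n, y = n*k) of formula (21), normalizing series truncated."""
    z = alpha + beta * k
    norm = sum(gamma_weight(i, scheme) * math.exp(-z * i) for i in range(terms))
    return gamma_weight(n, scheme) * math.exp(-z * n) / norm

def prob_nonzero(k, alpha, beta, scheme):
    """Probability that the pair (x, y) differs from (0, 0)."""
    return 1.0 - p_k(0, k, alpha, beta, scheme)

alpha, beta, k = 0.2, 0.5, 3
bound = math.exp(-alpha - beta * k)
for scheme in ("complete", "symmetric", "antisymmetric"):
    print(scheme, prob_nonzero(k, alpha, beta, scheme), bound)
```

For symmetric statistics the probability coincides with the bound $e^{-\alpha - \beta k}$, since there $\gamma(n + 1) = \gamma(n)$ for every $n$; for the other two schemes it lies strictly below it.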
In the case of material particles, as we shall see below, the quantum
theory gives different values for the numbers $g_k$ from those found in the
case of photons. However, here too the series

$$\sum_{k=1}^{\infty} g_k\,e^{-\beta k}$$

always converges for all $\beta > 0$. As in the photon case, we therefore easily
conclude that the series

$$\sum_{i=1}^{\infty} x_i = X, \qquad \sum_{i=1}^{\infty} y_i = Y$$

both converge with probability 1. The distribution of the pair of random
variables $(X, Y)$ is clearly not degenerate. It will be denoted by $P(X, Y)$.
Evidently the form of this law depends only on the parameters $\alpha$ and $\beta$ and
on the nature of the particles composing the system. The law $P(X, Y)$ plays
a fundamental role in the remainder of this chapter.

As in the photon case, we consider the volume $V$ occupied by our system
to be an integer. Let us consider $V$ mutually independent pairs of random
variables $(X_i, Y_i)$ $(i = 1, 2, \ldots, V)$ distributed according to the same
law $P(X, Y)$. Let us put

$$\sum_{i=1}^{V} X_i = S_V, \qquad \sum_{i=1}^{V} Y_i = T_V,$$

$C(p) = p!$ for complete statistics,

$C(p) = 1$ for the other two statistics.

Let us also assume that the series

$$\sum_{p=0}^{\infty} \sum_{q=0}^{\infty} e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p) = \Phi(\alpha, \beta)$$

converges for the values of the parameters $\alpha$ and $\beta$ which will be chosen
below. Then we obtain the following theorem:

THEOREM. For any pair of non-negative integers $p$, $q$,

$$\Omega(p, q) = C(p)\, \Phi(\alpha, \beta)\, e^{\alpha p+\beta q}\, P(S_V = p,\ T_V = q), \tag{22}$$

where the last factor on the right side denotes the probability of the simultaneous
fulfillment of the equalities $S_V = p$, $T_V = q$.

Proof. For brevity, let us put

$$\Bigl\{\sum_{n=0}^{\infty} \gamma(n)\, e^{-nz}\Bigr\}^{-1} = \Gamma(z).$$

Then $p_k(x, y)$ can be written more briefly as

$$P(x = n,\ y = nk) = \Gamma(\alpha + \beta k)\, \gamma(n)\, e^{-(\alpha+\beta k)n} \qquad (n = 0, 1, \ldots).$$


We defined the pair of random variables $(S_V, T_V)$ as the sum of $V$ mutually
independent pairs $(X_i, Y_i)$ $(i = 1, 2, \ldots, V)$ each of which is
distributed according to the law $P(X, Y)$. This law $P(X, Y)$ is, in turn, the
distribution of the sum of an infinite series of mutually independent random
pairs among which there are $g_k$ pairs distributed according to the law
$p_k(x, y)$ $(k = 1, 2, \ldots)$. This shows that the pair $(S_V, T_V)$ can be
considered as the sum of an infinite series of mutually independent pairs among
which there are $Vg_k$ pairs distributed according to the law
$p_k(x, y)$ $(k = 1, 2, \ldots)$.

We will denote these latter random pairs by $(x_{kl}, y_{kl})$ $(l = 1, 2, \ldots, Vg_k)$,
so that

$$S_V = \sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} x_{kl}, \qquad T_V = \sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} y_{kl}.$$

The pair $(x_{kl}, y_{kl})$ is, of course, distributed according to the law $p_k(x, y)$.

For the variables $S_V$, $T_V$ to have the values $p$, $q$, respectively, it is
necessary that the values $x_{kl} = n_{kl}$, $y_{kl} = kn_{kl}$, assumed by the random
variables $x_{kl}$, $y_{kl}$, should satisfy the relations

$$\sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} n_{kl} = p, \qquad \sum_{k=1}^{\infty} k \sum_{l=1}^{Vg_k} n_{kl} = q. \tag{P}$$


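Let $n_{kl}$ $(1 \le k < \infty,\ 1 \le l \le Vg_k)$ be a definite set of non-negative
integers satisfying the equations (P). Then, because of the mutual
independence of the pairs $(x_{kl}, y_{kl})$, the probability that $x_{kl} = n_{kl}$, $y_{kl} = kn_{kl}$
$(1 \le k < \infty,\ 1 \le l \le Vg_k)$ will be

$$\begin{aligned}
\prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} P(x_{kl} = n_{kl},\ y_{kl} = kn_{kl})
&= \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \Gamma(\alpha + \beta k)\, \gamma(n_{kl})\, e^{-(\alpha+\beta k)n_{kl}} \\
&= \Bigl\{\prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k}\Bigr\}\, e^{-\alpha\Sigma_1-\beta\Sigma_2} \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \gamma(n_{kl}) \\
&= e^{-\alpha p-\beta q} \prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k} \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \gamma(n_{kl}),
\end{aligned} \tag{23}$$

where $\Sigma_1 = \sum_{k=1}^{\infty} \sum_{l=1}^{Vg_k} n_{kl} = p$ and $\Sigma_2 = \sum_{k=1}^{\infty} k \sum_{l=1}^{Vg_k} n_{kl} = q$ by (P).

This is the probability that the random variables $x_{kl}$, $y_{kl}$, respectively,
will assume definite values $n_{kl}$, $kn_{kl}$ which satisfy the equations (P).
Because of the above, the probability that $S_V = p$, $T_V = q$ is the sum of
probabilities of the type (23) extended over the whole system of non-negative
integers $n_{kl}$ which satisfy equations (P), i.e.,

$$P(S_V = p,\ T_V = q) = e^{-\alpha p-\beta q} \Bigl\{\prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k}\Bigr\} \sum_{(P)} \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \gamma(n_{kl}). \tag{24}$$

Now, if we recall that in the sequence of energy levels

$$\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_r, \ldots$$

of a particle the integer $k$ appears $Vg_k$ times, then we note that the sum

$$\sum_{(P)} \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \gamma(n_{kl})$$

on the right side of (24) differs only in notation from the sum

$$\sum_{(K)} \prod_{r=1}^{\infty} \gamma(a_r),$$

which is used (see III, §5) to define $\Omega(p, q)$, if the conditions (K) have the
form

$$\sum_{r=1}^{\infty} a_r = p, \qquad \sum_{r=1}^{\infty} a_r \varepsilon_r = q. \tag{K}$$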


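[In the case of complete statistics, the sum $\sum_{(K)}$ must be multiplied by $p!$
in order to obtain $\Omega(p, q)$.] Therefore, we obtain

$$C(p) \sum_{(P)} \prod_{k=1}^{\infty} \prod_{l=1}^{Vg_k} \gamma(n_{kl}) = \Omega(p, q),$$

where $C(p) = p!$ for complete statistics and $C(p) = 1$ for the other two
statistics. Hence equation (24) yields

$$P(S_V = p,\ T_V = q) = e^{-\alpha p-\beta q} \prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k}\, \Omega(p, q)/C(p). \tag{25}$$

Summing this relation over $p$ and $q$ from 0 to $\infty$, we find

$$1 = \Bigl\{\prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k}\Bigr\} \Bigl\{\sum_{p=0}^{\infty} \sum_{q=0}^{\infty} e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p)\Bigr\} = \Bigl\{\prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k}\Bigr\}\, \Phi(\alpha, \beta).$$

Hence,

$$\prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{Vg_k} = [\Phi(\alpha, \beta)]^{-1}, \tag{26}$$

and therefore (25) yields

$$P(S_V = p,\ T_V = q) = e^{-\alpha p-\beta q}\, \Omega(p, q)/\Phi(\alpha, \beta)\, C(p),$$

which is equivalent to (22). Thus, our theorem is proved.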
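
As a sanity check on the theorem, the following sketch verifies (25) and (26) by brute force for a toy system. Everything specific in it is an assumption made for illustration only: antisymmetric statistics (so that all sums are finite), a hypothetical list of four integer energy levels, and arbitrarily chosen values of the two parameters.

```python
import itertools
import math

# Hypothetical toy system: antisymmetric statistics, gamma(0) = gamma(1) = 1,
# C(p) = 1, and a finite list of integer energy levels (level 1 occurs twice).
levels = [1, 1, 2, 3]
alpha, beta = 0.4, 0.7

# Omega(p, q): number of occupation patterns (each a_r is 0 or 1) with
# sum a_r = p and sum a_r * eps_r = q.
omega = {}
for occ in itertools.product((0, 1), repeat=len(levels)):
    p = sum(occ)
    q = sum(a * eps for a, eps in zip(occ, levels))
    omega[(p, q)] = omega.get((p, q), 0) + 1

# Phi as the double series over (p, q) ...
phi_series = sum(w * math.exp(-alpha * p - beta * q) for (p, q), w in omega.items())
# ... and as a product over levels, i.e. the reciprocal of the product of Gamma factors.
phi_product = math.prod(1.0 + math.exp(-(alpha + beta * eps)) for eps in levels)

# Probabilities P(S = p, T = q) as given by (25) and (26); they must sum to 1.
prob = {pq: math.exp(-alpha * pq[0] - beta * pq[1]) * w / phi_product
        for pq, w in omega.items()}
total_probability = sum(prob.values())
```

With these choices the two expressions for the normalizing function agree to rounding error, which is exactly the content of (26) for a finite system.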

Formula (22) reduces the study of the properties of the function $\Omega(p, q)$
to the investigation of the distribution of the pair of random variables
$(S_V, T_V)$. This pair represents the sum of a very large (in the limit,
infinitely large) number $V$ of mutually independent random pairs which are
identically distributed. Hence, we are led to one of the most thoroughly
discussed limit problems of the theory of probability for which a very accurate
solution is known. Let us note that the presence of the factor $\Phi(\alpha, \beta)$ on
the right side of (22) cannot cause any difficulty because the expressions for
the mean values of the various phase functions always contain only ratios of
structure functions $\Omega$. Consequently, this factor always cancels in such
expressions.

Let us make another remark of subsequent interest. In addition to
depending on the parameters $\alpha$ and $\beta$, the function $\Phi(\alpha, \beta)$ depends on the
form of the function $\Omega(p, q)$ which in turn depends on the volume $V$
occupied by the system. Hence, the function $\Phi(\alpha, \beta)$ also depends on $V$. The
form of this dependence is very simple as equation (26) shows: Denoting
by $\Phi_1(\alpha, \beta)$ the expression for the function $\Phi(\alpha, \beta)$ when $V = 1$, we find

$$\Phi(\alpha, \beta) = \{\Phi_1(\alpha, \beta)\}^V, \qquad \ln \Phi(\alpha, \beta) = V \ln \Phi_1(\alpha, \beta). \tag{27}$$

Thus, the function $\ln \Phi(\alpha, \beta)$ is directly proportional to the volume $V$
occupied by the system. It is easy to see that the dependence of the structure
function $\Omega(p, q)$ on the volume has a considerably more complex character.


§4. Choice of values for the parameters $\alpha$ and $\beta$


The values of the parameters $\alpha$ and $\beta$ were until now restricted only by
the general requirement that the series of interest to us converge. Otherwise,
these values remained arbitrary. Now, before proceeding to the application
of the limit theorem of the theory of probability, it will be expedient for us
to select these values so that subsequent computations will be as simple as
possible. The present section is devoted to this selection.

In §3, we defined the function

$$\Phi(\alpha, \beta) = \sum_{p, q=0}^{\infty} e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p)$$

for all three statistical schemes. Further, we showed [formula (26)] that

$$\Phi(\alpha, \beta) = \prod_{k=1}^{\infty} [\Gamma(\alpha + \beta k)]^{-Vg_k},$$

or, equivalently, that

$$\Phi(\alpha, \beta) = \prod_{r=1}^{\infty} [\Gamma(\alpha + \beta\varepsilon_r)]^{-1} = \prod_{r=1}^{\infty} \Bigl\{\sum_{n=0}^{\infty} \gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n}\Bigr\}, \tag{28}$$

where the function $\gamma(n)$ for each of the three statistical schemes is defined
according to rule (I) of §3.

Let us denote by $E_0$ the smallest possible value of the energy of a system
composed of $N$ particles of a given type. It is obvious that in the case of
complete or symmetric statistics $E_0 = N\varepsilon_1$. However, in the case of
antisymmetric statistics, where not more than one particle can be found in
the state characterized by the level $\varepsilon_r$, we will have $E_0 = \sum_{r=1}^{N} \varepsilon_r$.

Now let us prove the following general proposition: 

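THEOREM: Let a system composed of $N$ particles have energy $E > E_0$.
Then the set of equations

$$\partial \ln \Phi/\partial\alpha = -N, \qquad \partial \ln \Phi/\partial\beta = -E \tag{29}$$

has a unique solution $(\alpha, \beta)$ for $\beta > 0$.

Proof. We consider the function

$$F(\alpha, \beta) = e^{\alpha N+\beta E}\, \Phi(\alpha, \beta) = e^{\alpha N+\beta E} \prod_{r=1}^{\infty} \Bigl\{\sum_{n=0}^{\infty} \gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n}\Bigr\}$$

and we study its behavior in the half-plane $\beta > 0$. First, we establish
successively that

1°. $F(\alpha, \beta) \to \infty$ uniformly with respect to $\alpha$ $(-\infty < \alpha < \infty)$ as $\beta \to 0$,

2°. $F(\alpha, \beta) \to \infty$ uniformly with respect to $\alpha$ $(-\infty < \alpha < \infty)$ as $\beta \to \infty$,

3°. $F(\alpha, \beta) \to \infty$ uniformly with respect to $\beta$ $(0 < \beta < \beta_0)$ as $|\alpha| \to \infty$.
($\beta_0$ is any positive number.)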

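In order to prove 1°, we note that $\gamma(0) = \gamma(1) = 1$ for all statistics
and, therefore,

$$F(\alpha, \beta) \ge e^{\alpha N+\beta E} \prod_{r=1}^{\infty} (1 + e^{-\alpha-\beta\varepsilon_r}).$$

Let $A$ be an arbitrarily large integer and let $\beta$ be small enough so that
$e^{-\beta\varepsilon_r} > \tfrac{1}{2}$ $(r \le A)$. Then

$$\ln F(\alpha, \beta) > \alpha N + \sum_{r=1}^{A} \ln \{1 + e^{-\alpha-\beta\varepsilon_r}\} > \alpha N + A \ln \{1 + \tfrac{1}{2} e^{-\alpha}\} = f(\alpha).$$

An elementary computation easily shows that the function $f(\alpha)$ has its
smallest value for $\alpha = \ln [(A - N)/2N] = \alpha_0$. Consequently,

$$\ln F(\alpha, \beta) > f(\alpha) \ge f(\alpha_0) > N\alpha_0 = N \ln [(A - N)/2N].$$

Since $A$ can be as large as desired for small enough $\beta$, statement 1° is proved.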
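
The elementary computation referred to above can be spelled out as follows. This is only a sketch of the omitted step, using nothing beyond the definition of $f$ and assuming $A > N$:

$$f'(\alpha) = N - A\, \frac{\tfrac{1}{2} e^{-\alpha}}{1 + \tfrac{1}{2} e^{-\alpha}} = 0 \iff \tfrac{1}{2} e^{-\alpha} = \frac{N}{A - N} \iff \alpha = \alpha_0 = \ln \frac{A - N}{2N},$$

$$f(\alpha_0) = N\alpha_0 + A \ln \frac{A}{A - N} > N\alpha_0.$$

Since $f$ is convex in $\alpha$, the stationary point $\alpha_0$ is indeed the point of smallest value.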

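Let us turn to the proof of 2° and 3°. First let us assume that we are
concerned only with complete or symmetric statistics, so that $E_0 = N\varepsilon_1$ and
$\gamma(n) > 0$ for any $n \ge 0$. Since $\gamma(0) = 1$,

$$\sum_{k=0}^{\infty} \gamma(k)\, e^{-(\alpha+\beta\varepsilon_r)k} > 1 \qquad (r \ge 1),$$

and in virtue of (28),

$$\Phi(\alpha, \beta) > \sum_{k=0}^{\infty} \gamma(k)\, e^{-(\alpha+\beta\varepsilon_1)k}.$$

Consequently, for any $m \ge 0$,

$$\Phi(\alpha, \beta) > \gamma(m)\, e^{-(\alpha+\beta\varepsilon_1)m},$$

whence

$$F(\alpha, \beta) > \gamma(m)\, e^{\alpha(N-m)+\beta(E-m\varepsilon_1)}. \tag{30}$$

Putting $m = N$, we find

$$F(\alpha, \beta) > \gamma(N)\, e^{\beta(E-E_0)}.$$

Since $E > E_0$ by hypothesis, 2° is proved. Further, putting $m = N + 1$
and $m = N - 1$ in (30), we find

$$F(\alpha, \beta) > \gamma(N + 1)\, e^{-\alpha+\beta[E-(N+1)\varepsilon_1]},$$

and

$$F(\alpha, \beta) > \gamma(N - 1)\, e^{\alpha+\beta[E-(N-1)\varepsilon_1]}.$$

The first of these inequalities shows that $F(\alpha, \beta) \to \infty$ uniformly as
$\alpha \to -\infty$ $(0 < \beta < \beta_0)$. The second shows that $F(\alpha, \beta) \to \infty$
uniformly as $\alpha \to \infty$ $(\beta > 0)$. Hence, statement 3° is also proved.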
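Now, let us turn to the case of antisymmetric statistics. Here
$E_0 = \sum_{r=1}^{N} \varepsilon_r$ and

$$\begin{aligned}
F(\alpha, \beta) &= e^{\alpha N+\beta E} \prod_{r=1}^{\infty} (1 + e^{-\alpha-\beta\varepsilon_r}) \\
&= e^{\beta(E-E_0)} \prod_{r=1}^{N} (e^{\alpha+\beta\varepsilon_r} + 1) \prod_{r=N+1}^{\infty} (1 + e^{-\alpha-\beta\varepsilon_r}).
\end{aligned} \tag{31}$$

Therefore,

$$F(\alpha, \beta) > e^{\beta(E-E_0)} \prod_{r=1}^{N} (e^{\alpha+\beta\varepsilon_r} + 1).$$

Because $E > E_0$, this inequality proves 2°. But here we evidently also
obtain 3° for the case $\alpha \to \infty$. In order to prove 3° for the case $\alpha \to -\infty$,
it is sufficient to note that, just as before, it follows from (31) that

$$F(\alpha, \beta) > e^{\beta(E-E_0)} (1 + e^{-\alpha-\beta\varepsilon_{N+1}}).$$

From this we find that $F(\alpha, \beta) \to \infty$ uniformly as $\alpha \to -\infty$ in any
interval $0 < \beta < \beta_0$. Hence, statements 1°, 2° and 3° are proved for all three
kinds of statistics.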

It evidently follows from the three statements proved above that $F(\alpha, \beta)$
approaches infinity uniformly as $\beta \to 0$ and as $\alpha^2 + \beta^2 \to \infty$. Therefore,
it assumes a smallest value at a certain point $(\alpha, \beta)$ in the half-plane $\beta > 0$.
At this point we have

$$\partial \ln F(\alpha, \beta)/\partial\alpha = N + \partial \ln \Phi(\alpha, \beta)/\partial\alpha = 0,$$
$$\partial \ln F(\alpha, \beta)/\partial\beta = E + \partial \ln \Phi(\alpha, \beta)/\partial\beta = 0$$

and, hence, that

$$\partial \ln \Phi/\partial\alpha = -N, \qquad \partial \ln \Phi/\partial\beta = -E.$$

Thus, the existence of a solution of the system of equations (29) in the
half-plane $\beta > 0$ has been proved; the uniqueness still remains to be proved.
For this purpose, let us assume that the point $(\alpha', \beta')$, which is different
from the point $(\alpha, \beta)$, also satisfies this system of equations. Then, since
the second derivatives of the function $\ln F$ coincide with the corresponding
derivatives of the function $\ln \Phi$,

$$\ln F(\alpha, \beta) - \ln F(\alpha', \beta') = \tfrac{1}{2} \{(\alpha - \alpha')^2\, \partial^2 \ln \Phi/\partial\alpha^2 + 2(\alpha - \alpha')(\beta - \beta')\, \partial^2 \ln \Phi/\partial\alpha\,\partial\beta + (\beta - \beta')^2\, \partial^2 \ln \Phi/\partial\beta^2\},$$

where all three second order derivatives are evaluated at the same point

$$[\alpha' + \theta(\alpha - \alpha'),\ \beta' + \theta(\beta - \beta')] \qquad (0 < \theta < 1).$$

We quickly arrive at the desired contradiction if we show that the
quadratic form on the right side of the last equality is positive definite, because
then

$$\ln F(\alpha, \beta) - \ln F(\alpha', \beta') > 0,$$

which is impossible in view of the definition of the point $(\alpha, \beta)$. Thus, we
must still prove that

$$\partial^2 \ln \Phi/\partial\alpha^2 > 0,$$
$$(\partial^2 \ln \Phi/\partial\alpha^2)(\partial^2 \ln \Phi/\partial\beta^2) - (\partial^2 \ln \Phi/\partial\alpha\,\partial\beta)^2 > 0$$

identically in the region $\beta > 0$.


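We have

$$\ln \Phi(\alpha, \beta) = \sum_{r=1}^{\infty} \ln \Bigl\{\sum_{n=0}^{\infty} \gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n}\Bigr\} = \sum_{r=1}^{\infty} \ln S_r(\alpha, \beta),$$

where

$$S_r(\alpha, \beta) = \sum_{n=0}^{\infty} \gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n}.$$

Hence,

$$\partial^2 \ln \Phi/\partial\alpha^2 = \sum_{r=1}^{\infty} S_r^{-2} \{S_r\, \partial^2 S_r/\partial\alpha^2 - (\partial S_r/\partial\alpha)^2\} = \sum_{r=1}^{\infty} T_r, \tag{32}$$

where

$$T_r = S_r^{-2} \{S_r\, \partial^2 S_r/\partial\alpha^2 - (\partial S_r/\partial\alpha)^2\}.$$

Since

$$\partial S_r/\partial\alpha = -\sum_{n=0}^{\infty} n\gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n},$$

and

$$\partial^2 S_r/\partial\alpha^2 = \sum_{n=0}^{\infty} n^2\gamma(n)\, e^{-(\alpha+\beta\varepsilon_r)n},$$

the Schwarz inequality easily yields

$$T_r > 0 \qquad (r = 1, 2, \ldots);$$

and therefore, by (32),

$$\partial^2 \ln \Phi/\partial\alpha^2 > 0.$$

Furthermore, it is easy to see that if a differentiation with respect to $\alpha$
is replaced by a differentiation with respect to $\beta$ in any partial derivative
of the sum $S_r$, then the partial derivative is multiplied by $\varepsilon_r$. Therefore,
it follows from (32) that

$$\partial^2 \ln \Phi/\partial\alpha\,\partial\beta = \sum_{r=1}^{\infty} \varepsilon_r T_r, \qquad \partial^2 \ln \Phi/\partial\beta^2 = \sum_{r=1}^{\infty} \varepsilon_r^2 T_r.$$

Hence, in the half-plane $\beta > 0$, in virtue of the Schwarz inequality, we
conclude that

$$(\partial^2 \ln \Phi/\partial\alpha^2)(\partial^2 \ln \Phi/\partial\beta^2) - (\partial^2 \ln \Phi/\partial\alpha\,\partial\beta)^2 > 0. \tag{33}$$

This proves our theorem.

In all that follows, we will understand $\alpha$ and $\beta$ to be numbers satisfying
the set of equations (29). The existence and uniqueness of this pair of
numbers have just been proved.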
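
Because the proof characterizes the solution of (29) as the unique minimum of a function with a positive definite matrix of second derivatives, the pair of parameters can be computed numerically by a damped Newton iteration. The sketch below is an illustration under stated assumptions, not part of the original argument: the finite list of levels, the values of N and E, the starting point and the tolerances are all hypothetical, and the closed forms used for the sums are those of the three statistics (index of symmetry 0, 1 or -1).

```python
import math

def log_phi(alpha, beta, levels, sigma):
    """ln Phi for sigma = 0 (complete), 1 (symmetric) or -1 (antisymmetric)."""
    total = 0.0
    for eps in levels:
        w = math.exp(-(alpha + beta * eps))
        total += w if sigma == 0 else -sigma * math.log(1.0 - sigma * w)
    return total

def moments(alpha, beta, levels, sigma):
    """Return -dlnPhi/dalpha, -dlnPhi/dbeta and the three second derivatives of ln Phi."""
    n = e = h11 = h12 = h22 = 0.0
    for eps in levels:
        t = 1.0 / (math.exp(alpha + beta * eps) - sigma)  # contribution of one level
        v = t * (1.0 + sigma * t)                          # minus its derivative in alpha
        n += t
        e += eps * t
        h11 += v
        h12 += eps * v
        h22 += eps * eps * v
    return n, e, h11, h12, h22

def admissible(alpha, beta, levels, sigma):
    """Stay in the half-plane beta > 0 (and inside the convergence region if sigma = 1)."""
    return beta > 0.0 and (sigma != 1 or alpha + beta * min(levels) > 0.0)

def solve_alpha_beta(levels, N, E, sigma, alpha=0.0, beta=0.5, tol=1e-10, max_iter=200):
    """Damped Newton minimisation of ln F = alpha*N + beta*E + ln Phi."""
    def ln_f(a, b):
        return a * N + b * E + log_phi(a, b, levels, sigma)
    for _ in range(max_iter):
        n, e, h11, h12, h22 = moments(alpha, beta, levels, sigma)
        g1, g2 = N - n, E - e                       # gradient of ln F
        if max(abs(g1), abs(g2)) < tol:
            break
        det = h11 * h22 - h12 * h12                 # positive by (33)
        d_alpha = -(h22 * g1 - h12 * g2) / det
        d_beta = -(h11 * g2 - h12 * g1) / det
        step, current = 1.0, ln_f(alpha, beta)
        for _ in range(60):                         # backtracking line search
            a_new, b_new = alpha + step * d_alpha, beta + step * d_beta
            if admissible(a_new, b_new, levels, sigma) and ln_f(a_new, b_new) <= current + 1e-12:
                alpha, beta = a_new, b_new
                break
            step *= 0.5
    return alpha, beta

# Hypothetical example: antisymmetric statistics, levels 1..30, N = 5, E = 25 (E_0 = 15).
levels = list(range(1, 31))
N, E, sigma = 5, 25.0, -1
alpha, beta = solve_alpha_beta(levels, N, E, sigma)
n_check, e_check, *_ = moments(alpha, beta, levels, sigma)
```

The backtracking step only guards against leaving the admissible region or increasing the function; near the minimum the full Newton step is taken.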

Let us make another important remark. Because of the relations (27) in
§3, the set of equations (29), which defines the values of the parameters
$\alpha$ and $\beta$, is equivalent to the set

$$\partial \ln \Phi_1/\partial\alpha = -N/V, \qquad \partial \ln \Phi_1/\partial\beta = -E/V. \tag{34}$$

As was stated at the start of §3, we shall subsequently let the numbers $N$,
$E$ and $V$ become infinitely large while holding their ratios constant. Hence,
the right sides of (34) will be constants. Since the form of the function
$\Phi_1(\alpha, \beta)$ is independent of $N$, $V$ and $E$, the selected values of the
parameters $\alpha$ and $\beta$ are also independent of them. Hence, we will always consider
the numbers $\alpha$ and $\beta$ as constants.


§5. Application of a limit theorem of the theory of probability 


Now we shall establish asymptotic formulas for the distribution of the
random pair $(S_V, T_V)$ which was constructed in §3. We stated at the end
of §3 that this problem reduces to one of the most thoroughly discussed
limit problems of the theory of probability, since the conditions necessary
to apply the central limit theorem (in its local form) are satisfied and,
moreover, we have the simplest case of identically distributed and mutually
independent terms.

We must first find the mathematical expectations of the quantities $S_V$
and $T_V$. Since

$$\Phi(\alpha, \beta) = \sum_{p, q=0}^{\infty} e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p),$$

then, by (22) and (29),

$$\begin{aligned}
\mathsf{E}S_V &= \sum_{p, q=0}^{\infty} p\, P(S_V = p,\ T_V = q) \\
&= [\Phi(\alpha, \beta)]^{-1} \sum_{p, q=0}^{\infty} p\, e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p) \\
&= -[\Phi(\alpha, \beta)]^{-1} [\partial\Phi(\alpha, \beta)/\partial\alpha] = -\partial \ln \Phi(\alpha, \beta)/\partial\alpha \\
&= N.
\end{aligned}$$

Similarly,

$$\mathsf{E}T_V = -\partial \ln \Phi(\alpha, \beta)/\partial\beta = E.$$

Thus, the mathematical expectations of the quantities $S_V$ and $T_V$ equal
$N$ and $E$, respectively, for the values of the parameters $\alpha$ and $\beta$ which we
selected. These results will impart especial simplicity to our asymptotic
formulas.

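Now we must determine the second central moments of the quantities
$S_V$ and $T_V$. We find

$$\begin{aligned}
B_{11} = \mathsf{E}\{[S_V - \mathsf{E}S_V]^2\} &= \mathsf{E}\{S_V^2\} - \{(1/\Phi)(\partial\Phi/\partial\alpha)\}^2 \\
&= \sum_{p, q=0}^{\infty} p^2\, P(S_V = p,\ T_V = q) - \{(1/\Phi)(\partial\Phi/\partial\alpha)\}^2 \\
&= (1/\Phi) \sum_{p, q=0}^{\infty} p^2\, e^{-\alpha p-\beta q}\, \Omega(p, q)/C(p) - \{(1/\Phi)(\partial\Phi/\partial\alpha)\}^2 \\
&= (1/\Phi)(\partial^2\Phi/\partial\alpha^2) - \{(1/\Phi)(\partial\Phi/\partial\alpha)\}^2 \\
&= \partial^2 \ln \Phi(\alpha, \beta)/\partial\alpha^2,
\end{aligned}$$

and therefore, by (27) (§3),

$$B_{11} = V\, \partial^2 \ln \Phi_1(\alpha, \beta)/\partial\alpha^2 = Vb_{11},$$

where $b_{11} = \partial^2 \ln \Phi_1/\partial\alpha^2$ is a constant.
In exactly the same way, we find

$$B_{12} = \mathsf{E}\{S_V T_V - \mathsf{E}S_V\, \mathsf{E}T_V\} = \partial^2 \ln \Phi(\alpha, \beta)/\partial\alpha\,\partial\beta = Vb_{12},$$
$$B_{22} = \mathsf{E}\{[T_V - \mathsf{E}T_V]^2\} = \partial^2 \ln \Phi(\alpha, \beta)/\partial\beta^2 = Vb_{22},$$

where $b_{12}$ and $b_{22}$ are constants. Here, in virtue of (33) of §4,

$$B_{11}B_{22} - B_{12}^2 = V^2(b_{11}b_{22} - b_{12}^2) = V^2\Delta > 0.$$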


To estimate the probability $P(S_V = N + u_1,\ T_V = E + u_2)$ we use
Theorem 2 of Chapter I. The vectors $(X_i, Y_i)$, which compose the vector
$(S_V, T_V)$, are non-degenerate mutually independent integral-valued random
vectors (§3), identically distributed according to the law $P(X, Y)$.
In view of I, §2, the vector $(X_i, Y_i)$ has the maximal lattice $a_0 + k\alpha + l\beta$,
$b_0 + k\gamma + l\delta$, where $|\alpha\delta - \beta\gamma| = |d| > 0$.* Thus, the possible values
of the vector $(S_V, T_V)$ belong to the lattice $Va_0 + k\alpha + l\beta$, $Vb_0 + k\gamma + l\delta$.
From equation (22) of §3 it follows that these possible values are
simultaneously those values of the arguments of the structure function $\Omega(p, q)$
for which $\Omega$ is not zero. In other words, all the physically possible pairs of
values of the number of particles and the total energy of the system are
included in this lattice. Therefore, we must put

$$N + u_1 = Va_0 + k\alpha + l\beta, \qquad E + u_2 = Vb_0 + k\gamma + l\delta.$$


* The two different meanings of each of the symbols $\alpha$, $\beta$ can be clearly
distinguished by context.

Since $N$ and $E$ are, respectively, the mathematical expectations of the
quantities $S_V$ and $T_V$, it follows that $u_1$ and $u_2$, in formulas (38) and (39)
of Theorem 2 of Chapter I, have the same values as here. Moreover, we
must evidently write $V$ instead of $n$. Hence, formula (39) of Chapter I
gives

$$P(S_V = N + u_1,\ T_V = E + u_2) = \frac{|d|}{2\pi V\Delta^{1/2}} \exp\Bigl\{-\frac{1}{2\Delta V}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr\} + O[V^{-2}(1 + u)], \tag{35}$$

where we put $u = |u_1| + |u_2|$. Similarly, formula (38) of Chapter I yields
the more accurate estimate

$$P(S_V = N + u_1,\ T_V = E + u_2) = \frac{|d|}{2\pi V\Delta^{1/2}} \exp\Bigl\{-\frac{1}{2\Delta V}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr\} + (m_0 + m_1u_1 + m_2u_2)V^{-2} + O[V^{-3}(V^{1/2} + u^4)], \tag{36}$$

where $m_0$, $m_1$, $m_2$ are constants independent of $V$, $u_1$ and $u_2$.
If it is noted that

$$\exp\Bigl\{-\frac{1}{2\Delta V}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr\} = 1 + O(V^{-1}u^2),$$

then equation (35) yields

$$P(S_V = N + u_1,\ T_V = E + u_2) = \frac{|d|}{2\pi V\Delta^{1/2}} + O[V^{-2}(1 + u^2)]. \tag{37}$$

Similarly, if the more accurate estimate

$$\exp\Bigl\{-\frac{1}{2\Delta V}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr\} = 1 - \frac{1}{2\Delta V}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2) + O(V^{-2}u^4)$$

is used, then equation (36) yields

$$P(S_V = N + u_1,\ T_V = E + u_2) = \frac{|d|}{2\pi V\Delta^{1/2}} + V^{-2}\Bigl\{m_0 + m_1u_1 + m_2u_2 - \frac{|d|}{4\pi\Delta^{3/2}}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr\} + O[V^{-3}(V^{1/2} + u^4)]. \tag{38}$$

Formulas (37) and (38) will be used in all of the following sections.
Let us state again that for these formulas to be valid we must choose
values of $u_1$ and $u_2$ such that the point $(N + u_1, E + u_2)$ belongs to the
lattice $Va_0 + k\alpha + l\beta$, $Vb_0 + k\gamma + l\delta$. In particular, this condition is always
satisfied if $\Omega(N + u_1, E + u_2) > 0$, i.e., if the system composed of $N + u_1$
particles can have the total energy $E + u_2$.
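
The quality of the local estimate can be examined numerically on a toy model. The sketch below is a hedged illustration, not a reproduction of Theorem 2 of Chapter I: it assumes antisymmetric statistics, one level at each of the hypothetical energies 1 to 5 per unit volume, V = 12, arbitrary parameter values, and a lattice consisting of all integer pairs (so that the lattice constant d is taken to be 1). It builds the exact distribution of the pair of sums by repeated convolution and compares one probability near the mean with the Gaussian leading term of (35).

```python
import math

def occupied_probability(alpha, beta, k):
    """P(x = 1) under the law p_k for antisymmetric statistics."""
    return 1.0 / (math.exp(alpha + beta * k) + 1.0)

def exact_distribution(alpha, beta, levels, V):
    """Exact law of (S_V, T_V): V independent copies of every level, by convolution."""
    dist = {(0, 0): 1.0}
    for k in levels:
        t = occupied_probability(alpha, beta, k)
        for _ in range(V):
            new = {}
            for (p, q), pr in dist.items():
                new[(p, q)] = new.get((p, q), 0.0) + pr * (1.0 - t)
                new[(p + 1, q + k)] = new.get((p + 1, q + k), 0.0) + pr * t
            dist = new
    return dist

def gaussian_estimate(alpha, beta, levels, V, p, q, d=1.0):
    """Leading term of (35), with b_11, b_12, b_22 computed for one unit of volume."""
    n1 = e1 = b11 = b12 = b22 = 0.0
    for k in levels:
        t = occupied_probability(alpha, beta, k)
        w = t * (1.0 - t)          # variance of a single occupation number
        n1 += t
        e1 += k * t
        b11 += w
        b12 += k * w
        b22 += k * k * w
    delta = b11 * b22 - b12 * b12
    u1 = p - V * n1                # deviation from the mean number of particles
    u2 = q - V * e1                # deviation from the mean energy
    quad = b22 * u1 * u1 - 2.0 * b12 * u1 * u2 + b11 * u2 * u2
    return d / (2.0 * math.pi * V * math.sqrt(delta)) * math.exp(-quad / (2.0 * delta * V))

# Hypothetical parameters; the means of S_V and T_V play the role of N and E.
alpha, beta, V = 0.0, 0.3, 12
levels = [1, 2, 3, 4, 5]
dist = exact_distribution(alpha, beta, levels, V)
p = round(V * sum(occupied_probability(alpha, beta, k) for k in levels))
q = round(V * sum(k * occupied_probability(alpha, beta, k) for k in levels))
exact = dist[(p, q)]
approx = gaussian_estimate(alpha, beta, levels, V, p, q)
ratio = exact / approx
```

For this small volume the two numbers should already be close; the remainder terms in (35) and (36) describe how the discrepancy decreases as V grows.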


§6. Mean values of sum functions


For the purposes of the present section we can limit ourselves to the 
crude estimate given by the very simple formula (37). Only later will we 
need the more accurate estimate given by formula (38). 

First note that the theorem of §3 [formula (22)], in conjunction with
(37), yields

$$\Omega(N + u_1, E + u_2) = C(N + u_1)\, \Phi(\alpha, \beta)\, e^{\alpha N+\beta E}\, e^{\alpha u_1+\beta u_2} \Bigl\{\frac{|d|}{2\pi V\Delta^{1/2}} + O[V^{-2}(1 + u^2)]\Bigr\}.$$

In particular, for $u_1 = u_2 = 0$, it follows from this that

$$\Omega(N, E) = C(N)\, \Phi(\alpha, \beta)\, e^{\alpha N+\beta E} \Bigl\{\frac{|d|}{2\pi V\Delta^{1/2}} + O(V^{-2})\Bigr\},$$

and, therefore, that

$$\Omega(N + u_1, E + u_2)/\Omega(N, E) = \{C(N + u_1)\, e^{\alpha u_1+\beta u_2}/C(N)\}\{1 + O[V^{-1}(1 + u^2)]\}. \tag{39}$$

Hence, having an asymptotic estimate of a ratio of the form

$$\Omega(N + u_1, E + u_2)/\Omega(N, E),$$

we can easily find very simple approximate expressions for the mean values
of the occupation numbers on the basis of the formulas derived in §2.

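In the case of complete statistics for which $C(p) = p!$, formula (6) of
§2, together with formula (39), yields

$$\langle a_r \rangle = e^{-\alpha-\beta\varepsilon_r} \{1 + O[V^{-1}(1 + \varepsilon_r^2)]\}. \tag{40}$$

In the case of symmetric statistics we have $C(p) = 1$, and formula (10)
of §2, together with formula (39), yields

$$\begin{aligned}
\langle a_r \rangle &= \sum_{k=1}^{\infty} e^{-k(\alpha+\beta\varepsilon_r)} \{1 + O[V^{-1}k^2(1 + \varepsilon_r^2)]\} \\
&= (e^{\alpha+\beta\varepsilon_r} - 1)^{-1} + O\Bigl[V^{-1}(1 + \varepsilon_r^2) \sum_{k=1}^{\infty} k^2 e^{-k(\alpha+\beta\varepsilon_r)}\Bigr].
\end{aligned} \tag{41}$$

Also, in the case of antisymmetric statistics $C(p) = 1$, and formula (13)
of §2, together with (39), yields

$$\langle a_r \rangle = (e^{\alpha+\beta\varepsilon_r} + 1)^{-1} + O\Bigl[V^{-1}(1 + \varepsilon_r^2) \sum_{k=1}^{\infty} k^2 e^{-k(\alpha+\beta\varepsilon_r)}\Bigr]. \tag{42}$$

Evidently, (40), (41) and (42) can be combined into one very simple
formula for any fixed value of $r$:

$$\langle a_r \rangle = (e^{\alpha+\beta\varepsilon_r} - \sigma)^{-1} + O(V^{-1}), \tag{43}$$

where $\sigma$ is the index of symmetry of the system.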
In all these derivations, we relied on formula (39), which in turn was
based on formula (37) of §5. Therefore, we must still convince ourselves
that if $(N, E)$ is a possible combination of the number of particles and of
the total energy of the system, i.e., if the point $(N, E)$ belongs to the
lattice $Va_0 + k\alpha + l\beta$, $Vb_0 + k\gamma + l\delta$, then the point $(N - k, E - k\varepsilon_r)$
belongs to the same lattice for any integer $k > 0$. But this is obvious since
$(1, \varepsilon_r)$ is a possible combination of the number of particles and the total
energy of the system. Thus, if the points $(N, E)$ and $(1, \varepsilon_r)$ both belong
to the aforementioned lattice, so also does the point $(N - k, E - k\varepsilon_r)$
for any integer $k$.
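
The leading term of (43) and the series from which it arises in (41) and (42) are easy to compare numerically. The following is a minimal sketch with hypothetical parameter values; the function names are ours, and the index of symmetry is passed as 0, 1 or -1 for complete, symmetric and antisymmetric statistics respectively.

```python
import math

def mean_occupation(alpha, beta, eps, sigma):
    """Leading term of (43): (e^{alpha + beta*eps} - sigma)^(-1)."""
    return 1.0 / (math.exp(alpha + beta * eps) - sigma)

def series_occupation(alpha, beta, eps, sigma, terms=200):
    """Truncated series sum_{k>=1} sigma^(k-1) e^{-k(alpha + beta*eps)}.

    For sigma = 1 and sigma = -1 these are the leading terms of (41) and (42);
    for sigma = 0 only the k = 1 term survives, which is the leading term of (40).
    """
    z = alpha + beta * eps
    return sum(sigma ** (k - 1) * math.exp(-k * z) for k in range(1, terms + 1))

# Hypothetical values of the parameters and of one energy level.
alpha, beta, eps = 0.5, 1.0, 2.0
closed = {s: mean_occupation(alpha, beta, eps, s) for s in (0, 1, -1)}
series = {s: series_occupation(alpha, beta, eps, s) for s in (0, 1, -1)}
```

For the same parameters the antisymmetric value is the smallest and the symmetric value the largest, with the complete-statistics value between them.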
Now let $\mathfrak{A}$ be a sum function related to the system, i.e., let

$$\mathfrak{A} = \sum_{i=1}^{N} \mathfrak{A}_i,$$

where the quantity $\mathfrak{A}_i$ depends only on the Hamiltonian variables of the
$i$th particle. If we assume that the mathematical expectation $\mathsf{E}_r\mathfrak{A}_i = A_r$ of
the quantity $\mathfrak{A}_i$, in the state corresponding to the energy level $\varepsilon_r$, is the
same for all $i$ and, consequently, depends only on $r$, then for the mean
value of the quantity $\mathsf{E}\mathfrak{A}$ we have, according to III, §5

$$\langle \mathsf{E}\mathfrak{A} \rangle = \sum_{r=1}^{\infty} A_r \langle a_r \rangle.$$

Therefore, (40), (41) and (42) yield

$$\langle \mathsf{E}\mathfrak{A} \rangle = \sum_{r=1}^{\infty} A_r (e^{\alpha+\beta\varepsilon_r} - \sigma)^{-1} + O\Bigl[V^{-1} \sum_{r=1}^{\infty} |A_r| (1 + \varepsilon_r^2) \sum_{k=1}^{\infty} k^2 e^{-k(\alpha+\beta\varepsilon_r)}\Bigr].$$

In view of the remark we made at the beginning of §3, every absolutely
convergent series of the form

$$\sum_{r=1}^{\infty} f(\varepsilon_r)$$

is a quantity proportional to $V$. Therefore, we obtain

$$\langle \mathsf{E}\mathfrak{A} \rangle = \sum_{r=1}^{\infty} A_r (e^{\alpha+\beta\varepsilon_r} - \sigma)^{-1} + O(1),$$

if all the series on the right side of the next to the last equality converge
absolutely. [This requires that the quantity $|A_r|$ should not increase too
rapidly, a condition which is always met in physical problems. We also
assume that the double series is $O(V)$ (see below).]

In the majority of cases $A_r$ has the same value for all states with the same
energy level $\varepsilon_r$, i.e., $A_r$ is a single-valued function $A(\varepsilon_r)$ of $\varepsilon_r$. Hence, we
can rewrite the last formula in terms of the numbers $g_k$ which we introduced
at the beginning of §3:

$$\langle \mathsf{E}\mathfrak{A} \rangle = V \sum_{k=1}^{\infty} g_k A(k) (e^{\alpha+\beta k} - \sigma)^{-1} + O(1).$$

This relation shows that $\langle \mathsf{E}\mathfrak{A} \rangle$ is asymptotically proportional to $V$.
(This is clear directly since $V$ is proportional to the number of particles
$N$.) Moreover, this relation shows the remarkable accuracy that use of the
limit theorems of the theory of probability gives even for a very crude
estimate of the remainder terms. This accuracy can be increased substantially,
as we shall soon see, if a more precise estimate of the remainder terms in
the limit theorems is used in the computations.


§7. Correlation between occupation numbers 


Our next problem is to estimate the dispersion of sum functions. To do
this we must first estimate the correlation between occupation numbers
with different subscripts. This is necessary since III, §6, (24) shows that
the value of the dispersion depends in an essential way on the order of
smallness of the difference $\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle$. It can be foreseen,
as in the photon case, that this correlation is very weak since the number
of particles is very large and their mutual correlation depends only on the
relations

$$\sum_{r=1}^{\infty} a_r = N, \qquad \sum_{r=1}^{\infty} a_r \varepsilon_r = E.$$

As before, we see that the main terms cancel when $\langle a_r \rangle \langle a_s \rangle$ is
subtracted from $\langle a_r a_s \rangle$. Consequently, the degree of accuracy to which the
formulas of §6 were derived is inadequate, and we must carry out all the
computations anew, basing them this time on the more accurate formula
(38) rather than on (37). First, (22) and (38) give for all three statistics

$$\Omega(N + u_1, E + u_2) = C(N + u_1)\, \Phi(\alpha, \beta)\, e^{\alpha N+\beta E}\, e^{\alpha u_1+\beta u_2} \Bigl\{\frac{|d|}{2\pi V\Delta^{1/2}} + V^{-2}\Bigl[(m_0 + m_1u_1 + m_2u_2) - \frac{|d|}{4\pi\Delta^{3/2}}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr] + O[V^{-3}(V^{1/2} + u^4)]\Bigr\};$$

and, in particular, for $u_1 = u_2 = 0$,

$$\Omega(N, E) = C(N)\, \Phi(\alpha, \beta)\, e^{\alpha N+\beta E} \Bigl\{\frac{|d|}{2\pi V\Delta^{1/2}} + m_0V^{-2} + O(V^{-5/2})\Bigr\}.$$

Whence

$$\Omega(N + u_1, E + u_2)/\Omega(N, E) = [C(N + u_1)/C(N)]\, e^{\alpha u_1+\beta u_2} \Bigl\{1 + V^{-1}\Bigl[l_1u_1 + l_2u_2 - \frac{1}{2\Delta}(b_{22}u_1^2 - 2b_{12}u_1u_2 + b_{11}u_2^2)\Bigr] + O[V^{-2}(V^{1/2} + u^4)]\Bigr\}, \tag{44}$$

where $l_1$ and $l_2$ are independent of $V$, $u_1$ and $u_2$.
This more accurate formula must replace formula (39) of the preceding
section in subsequent computations. For convenience in writing, we put

$$T_r = T_r(\alpha, \beta) = (e^{\alpha+\beta\varepsilon_r} - \sigma)^{-1}$$

for all three statistics.

Now we find the more precise expression for the mean values of the
occupation numbers.

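In the case of complete statistics formula (6) of §2 combined with (44)
yields

$$\langle a_r \rangle = e^{-\alpha-\beta\varepsilon_r} \Bigl\{1 - V^{-1}\Bigl[l_1 + l_2\varepsilon_r + \frac{1}{2\Delta}(b_{22} - 2b_{12}\varepsilon_r + b_{11}\varepsilon_r^2)\Bigr] + O[V^{-2}(V^{1/2} + \varepsilon_r^4)]\Bigr\}.$$

Noting that in the case of complete statistics

$$T_r = e^{-\alpha-\beta\varepsilon_r}$$

and therefore that

$$\partial T_r/\partial\alpha = -e^{-\alpha-\beta\varepsilon_r}, \qquad \partial T_r/\partial\beta = -\varepsilon_r e^{-\alpha-\beta\varepsilon_r},$$
$$\partial^2 T_r/\partial\alpha^2 = e^{-\alpha-\beta\varepsilon_r}, \qquad \partial^2 T_r/\partial\alpha\,\partial\beta = \varepsilon_r e^{-\alpha-\beta\varepsilon_r},$$
$$\partial^2 T_r/\partial\beta^2 = \varepsilon_r^2 e^{-\alpha-\beta\varepsilon_r},$$

we obtain

$$\langle a_r \rangle = T_r - V^{-1}\Bigl[-l_1\, \partial T_r/\partial\alpha - l_2\, \partial T_r/\partial\beta + \frac{1}{2\Delta}(b_{22}\, \partial^2 T_r/\partial\alpha^2 - 2b_{12}\, \partial^2 T_r/\partial\alpha\,\partial\beta + b_{11}\, \partial^2 T_r/\partial\beta^2)\Bigr] + O[V^{-2}(T_r V^{1/2} + \partial^4 T_r/\partial\beta^4)]. \tag{45}$$

In the case of symmetric statistics, formula (10) of §2 combined with
(44) yields

$$\langle a_r \rangle = \sum_{k=1}^{\infty} e^{-k(\alpha+\beta\varepsilon_r)} \Bigl\{1 - V^{-1}\Bigl[k(l_1 + l_2\varepsilon_r) + \frac{k^2}{2\Delta}(b_{22} - 2b_{12}\varepsilon_r + b_{11}\varepsilon_r^2)\Bigr] + O[V^{-2}(V^{1/2} + k^4\varepsilon_r^4)]\Bigr\}. \tag{46}$$

This time we have

$$T_r = (e^{\alpha+\beta\varepsilon_r} - 1)^{-1} = \sum_{k=1}^{\infty} e^{-k(\alpha+\beta\varepsilon_r)},$$

from which it follows that

$$\partial T_r/\partial\alpha = -\sum_{k=1}^{\infty} k\, e^{-k(\alpha+\beta\varepsilon_r)},$$
$$\partial T_r/\partial\beta = -\sum_{k=1}^{\infty} k\varepsilon_r\, e^{-k(\alpha+\beta\varepsilon_r)},$$
$$\partial^2 T_r/\partial\alpha^2 = \sum_{k=1}^{\infty} k^2\, e^{-k(\alpha+\beta\varepsilon_r)},$$
$$\partial^2 T_r/\partial\alpha\,\partial\beta = \sum_{k=1}^{\infty} k^2\varepsilon_r\, e^{-k(\alpha+\beta\varepsilon_r)},$$
$$\partial^2 T_r/\partial\beta^2 = \sum_{k=1}^{\infty} k^2\varepsilon_r^2\, e^{-k(\alpha+\beta\varepsilon_r)}.$$

Therefore, we find

$$\langle a_r \rangle = T_r - V^{-1}\Bigl[-l_1\, \partial T_r/\partial\alpha - l_2\, \partial T_r/\partial\beta + \frac{1}{2\Delta}(b_{22}\, \partial^2 T_r/\partial\alpha^2 - 2b_{12}\, \partial^2 T_r/\partial\alpha\,\partial\beta + b_{11}\, \partial^2 T_r/\partial\beta^2)\Bigr] + O[V^{-2}(T_r V^{1/2} + \partial^4 T_r/\partial\beta^4)],$$

i.e., we find exactly the same formula (45) as in the case of complete
statistics.

Finally, in the case of antisymmetric statistics, formula (13) of §2
combined with (44) yields a formula which differs from (46) only in that the
$k$th term of the sum is multiplied by $(-1)^{k-1}$ $(k = 1, 2, \ldots)$. In this case
since

$$T_r = (e^{\alpha+\beta\varepsilon_r} + 1)^{-1} = \sum_{k=1}^{\infty} (-1)^{k-1} e^{-k(\alpha+\beta\varepsilon_r)},$$

it is easy to see that (45) is also valid for the case of antisymmetric
statistics. Hence, formula (45), which is the necessary improved asymptotic
expression for $\langle a_r \rangle$, is valid for all three statistics. (It is understood, of
course, that $T_r$ has different values in each of the three cases.)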

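Now, let us estimate the quantity $\langle a_r a_s \rangle$ $(r \ne s)$. In the case of
complete statistics, we must start from formula (7) of §2 which, combined with
(44), yields

$$\begin{aligned}
\langle a_r a_s \rangle &= e^{-(\alpha+\beta\varepsilon_r)} e^{-(\alpha+\beta\varepsilon_s)} \Bigl\{1 - V^{-1}\Bigl[(l_1 + l_2\varepsilon_r) + (l_1 + l_2\varepsilon_s) + \frac{1}{2\Delta}\bigl(4b_{22} - 4b_{12}[\varepsilon_r + \varepsilon_s] + b_{11}[\varepsilon_r + \varepsilon_s]^2\bigr)\Bigr] + O[V^{-2}(V^{1/2} + \varepsilon_r^4 + \varepsilon_s^4)]\Bigr\} \\
&= T_rT_s - V^{-1}\Bigl[-l_1\, \partial(T_rT_s)/\partial\alpha - l_2\, \partial(T_rT_s)/\partial\beta + \frac{1}{2\Delta}\bigl(b_{22}\, \partial^2(T_rT_s)/\partial\alpha^2 - 2b_{12}\, \partial^2(T_rT_s)/\partial\alpha\,\partial\beta + b_{11}\, \partial^2(T_rT_s)/\partial\beta^2\bigr)\Bigr] \\
&\qquad + O[V^{-2}(T_rT_sV^{1/2} + T_r\, \partial^4 T_s/\partial\beta^4 + T_s\, \partial^4 T_r/\partial\beta^4)].
\end{aligned} \tag{47}$$

Since the method of obtaining formulas of this kind has been illustrated
in several examples, we can leave it to the reader to show independently
that formula (47), which we derived for the case of complete statistics, is
also valid for the other two statistical schemes.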

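Now we shall estimate the difference $\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle$. First,
note that formula (45) yields

$$\begin{aligned}
\langle a_r \rangle \langle a_s \rangle &= T_rT_s - V^{-1}\Bigl\{-l_1\, \partial(T_rT_s)/\partial\alpha - l_2\, \partial(T_rT_s)/\partial\beta \\
&\qquad + \frac{1}{2\Delta}\bigl[b_{22}(T_r\, \partial^2 T_s/\partial\alpha^2 + T_s\, \partial^2 T_r/\partial\alpha^2) \\
&\qquad - 2b_{12}(T_r\, \partial^2 T_s/\partial\alpha\,\partial\beta + T_s\, \partial^2 T_r/\partial\alpha\,\partial\beta) \\
&\qquad + b_{11}(T_r\, \partial^2 T_s/\partial\beta^2 + T_s\, \partial^2 T_r/\partial\beta^2)\bigr]\Bigr\} \\
&\qquad + O[V^{-2}(T_rT_sV^{1/2} + T_r\, \partial^4 T_s/\partial\beta^4 + T_s\, \partial^4 T_r/\partial\beta^4)].
\end{aligned} \tag{48}$$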


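Furthermore, subtracting (48) term by term from (47), we find

$$\begin{aligned}
\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle &= -\frac{1}{\Delta V}\bigl\{b_{22}(\partial T_r/\partial\alpha)(\partial T_s/\partial\alpha) \\
&\qquad - b_{12}[(\partial T_r/\partial\alpha)(\partial T_s/\partial\beta) + (\partial T_s/\partial\alpha)(\partial T_r/\partial\beta)] \\
&\qquad + b_{11}(\partial T_r/\partial\beta)(\partial T_s/\partial\beta)\bigr\} \\
&\qquad + O[V^{-2}(T_rT_sV^{1/2} + T_r\, \partial^4 T_s/\partial\beta^4 + T_s\, \partial^4 T_r/\partial\beta^4)].
\end{aligned} \tag{49}$$

This formula, like formulas (45), (47) and (48), holds for each of the
three fundamental statistical schemes. It shows that the difference

$$\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle,$$

which is a measure of the correlation between the numbers $a_r$ and $a_s$ $(r \ne s)$,
is infinitely small for the assumed conditions and is asymptotically
proportional to $V^{-1}$. Hence, the problem posed in this section has been solved
completely.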
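
The cancellation behind (49) is just the product rule. As a sketch of the omitted step, let $\xi$ and $\eta$ each stand for either of the two parameters (these two auxiliary symbols are ours and do not occur in the text):

$$\frac{\partial^2 (T_rT_s)}{\partial\xi\,\partial\eta} - T_r \frac{\partial^2 T_s}{\partial\xi\,\partial\eta} - T_s \frac{\partial^2 T_r}{\partial\xi\,\partial\eta} = \frac{\partial T_r}{\partial\xi} \frac{\partial T_s}{\partial\eta} + \frac{\partial T_r}{\partial\eta} \frac{\partial T_s}{\partial\xi}.$$

The main terms and the first-order terms containing $l_1$ and $l_2$ are identical in (47) and (48) and cancel. Only the cross terms above survive, weighted by $b_{22}$, $-2b_{12}$, $b_{11}$ and the common factor $1/2\Delta V$, which gives the coefficient $1/\Delta V$ in (49).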


§8. Dispersion of sum functions and the suitability 
of microcanonical averages 


We shall now estimate the dispersion of sum functions which, as we know,
is required to establish the suitability of their microcanonical averages.
Besides the asymptotic estimates of the numbers $\langle a_r \rangle$ and
$\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle$, which we have already found, we also need an estimate of
the microcanonical dispersion of the number $a_r$, i.e., we must estimate the
number $\langle a_r^2 \rangle - (\langle a_r \rangle)^2$. We shall study this latter problem first.

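We saw at the end of §2 [formula (20)] that for all three fundamental
statistical schemes

$$\langle a_r^2 \rangle = \langle a_r \rangle + (1 + \sigma)\pi_r,$$

where

$$\pi_r = \langle a_r a_s \rangle \qquad (r \ne s,\ \varepsilon_r = \varepsilon_s).$$

Therefore,

$$\langle a_r^2 \rangle - (\langle a_r \rangle)^2 = \langle a_r \rangle + \sigma(\langle a_r \rangle)^2 + (1 + \sigma)[\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle]_{r \ne s,\ \varepsilon_r = \varepsilon_s}, \tag{50}$$

since $\langle a_r \rangle = \langle a_s \rangle$ if $\varepsilon_r = \varepsilon_s$.

This formula solves completely the problem posed above since the
asymptotic expressions for the numbers $\langle a_r \rangle$ and $\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle$
$(r \ne s)$ are given, respectively, by formulas (45) and (49) of the preceding
section.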
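
The passage from formula (20) of §2 to (50) is a one-line rearrangement, written out here as a sketch for a pair $r \ne s$ with $\varepsilon_r = \varepsilon_s$, so that $\langle a_r \rangle = \langle a_s \rangle$:

$$\langle a_r^2 \rangle - \langle a_r \rangle^2 = \langle a_r \rangle + (1 + \sigma)\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle = \langle a_r \rangle + \sigma \langle a_r \rangle^2 + (1 + \sigma)[\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle].$$

Since the bracket is of order $V^{-1}$ by (49), the leading part of the dispersion of a single occupation number is $\langle a_r \rangle + \sigma \langle a_r \rangle^2$, that is, approximately $T_r(1 + \sigma T_r)$ when the leading term of (45) is substituted.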

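According to III, §6, (24), we have for the microcanonical dispersion of a
sum function $\mathfrak{A}$ the expression

$$\begin{aligned}
D(\mathfrak{A}) &= \sum_{r=1}^{\infty} (\mu_r - A_r^2) \langle a_r \rangle + \sum_{r, s=1}^{\infty} A_rA_s[\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle] \\
&= \sum_{r=1}^{\infty} (\mu_r - A_r^2) \langle a_r \rangle + \sum_{r \ne s} A_rA_s[\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle] + \sum_{r=1}^{\infty} A_r^2[\langle a_r^2 \rangle - (\langle a_r \rangle)^2].
\end{aligned} \tag{51}$$

Putting, for any $r$ and $s$,

$$c_{rs} = \langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle \qquad (r \ne s),$$
$$c_{rr} = [\langle a_r a_s \rangle - \langle a_r \rangle \langle a_s \rangle]_{s \ne r,\ \varepsilon_s = \varepsilon_r} \qquad (r = s),$$

we can rewrite (50) in the form

$$\langle a_r^2 \rangle - (\langle a_r \rangle)^2 = \langle a_r \rangle + \sigma(\langle a_r \rangle)^2 + (1 + \sigma)c_{rr}.$$

Inserting this expression in the right side of formula (51), we find

$$\begin{aligned}
D(\mathfrak{A}) &= \sum_{r=1}^{\infty} (\mu_r - A_r^2) \langle a_r \rangle + \sum_{r \ne s} A_rA_sc_{rs} + \sum_{r=1}^{\infty} A_r^2[\langle a_r \rangle + \sigma(\langle a_r \rangle)^2] + (1 + \sigma) \sum_{r=1}^{\infty} A_r^2c_{rr} \\
&= \sum_{r \ne s} A_rA_sc_{rs} + \sum_{r=1}^{\infty} A_r^2c_{rr} + \sum_{r=1}^{\infty} \{\mu_r \langle a_r \rangle + \sigma A_r^2[c_{rr} + (\langle a_r \rangle)^2]\} \\
&= \sum_{r, s=1}^{\infty} A_rA_sc_{rs} + \sum_{r=1}^{\infty} \{\mu_r \langle a_r \rangle + \sigma A_r^2[c_{rr} + (\langle a_r \rangle)^2]\}.
\end{aligned} \tag{52}$$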

Now turning to the asymptotic estimate of the quantity $D(\mathfrak{A})$, we note
first that the right side of formula (49) evidently represents the quantity
$c_{rs}$ for any $r$ and $s$ (equal or unequal). For brevity, let us put
$\sum_{r=1}^{\infty} A_rT_r = Q = Q(\alpha, \beta)$. Using (49), we obtain

$$\sum_{r, s=1}^{\infty} A_rA_sc_{rs} = -\frac{1}{\Delta V}\{b_{22}(\partial Q/\partial\alpha)^2 - 2b_{12}(\partial Q/\partial\alpha)(\partial Q/\partial\beta) + b_{11}(\partial Q/\partial\beta)^2\} + O[V^{-2}(Q^2V^{1/2} + Q\, \partial^4 Q/\partial\beta^4)].$$

Let us recall (see the beginning of §3, for example) that every absolutely
convergent series such as

$$\sum_{r=1}^{\infty} f(\varepsilon_r)$$

is a quantity proportional to $V$. We have defined $Q$ as precisely such a series
since $A_r$ and $T_r$ depend on $\varepsilon_r$. It is evident that any partial derivative of
the function $Q(\alpha, \beta)$ has the form of such a series. Hence, the quantity $Q$
in the right side of the last equality, and all its partial derivatives, are
quantities proportional to $V$. This yields

$$\sum_{r, s=1}^{\infty} A_rA_sc_{rs} = C_1V + O(V^{1/2}),$$

where $C_1$ is a constant.
Furthermore, by the same reasoning,

$$\sum_{r=1}^{\infty} \{\mu_r \langle a_r \rangle + \sigma A_r^2[c_{rr} + (\langle a_r \rangle)^2]\} = C_2V,$$

where $C_2$ is another constant. Therefore, because of the constancy of the
ratio $V/N$, formula (52) yields

$$D(\mathfrak{A}) = (C_1 + C_2)V + O(V^{1/2}) = CN + O(N^{1/2}).$$

Hence, the microcanonical dispersion of a sum function $\mathfrak{A}$ is
asymptotically proportional to the number of particles $N$. We saw in III, §6 that
compliance with the relation

$$D(\mathfrak{A}) = o(N^2)$$

is sufficient to establish the suitability of microcanonical averages. We see
now that this relation is (more than) satisfied for a sum function. Moreover,
we obtained a very practical method to estimate asymptotically the
microcanonical dispersion. This is of substantial value to fluctuation theory in
physics.


§9. Determination of the numbers $g_k$ for structureless particles
in the absence of external forces


By analogy with the procedure used for photons in §5 of the preceding
chapter, we now determine for material particles the number $Vg_k$ of possible
energy levels included between $k$ and $k + 1$ when the particles are enclosed
in a vessel of volume $V$. We again assume that the state of a particle is
determined by the Hamiltonian variables $x$, $y$, $z$, $p_x$, $p_y$, $p_z$ and that no
external forces act on the particle. Hence its total energy consists only of
kinetic energy and the potential energy due to the vessel walls. We will
assume that this latter energy vanishes within the vessel and is infinitely
large outside. We again emphasize that the problem concerning us now has
no relation to statistics. In particular, we can imagine that we are concerned
with a single particle.

As was remarked in §5 of the preceding chapter, the energy $\varepsilon$ of a
(nonrelativistic) material particle is related to its momentum $p$ by the equation

$$\varepsilon = mc^2 + p^2/2m,$$

where $m$ is the mass of the particle and $c$ is the velocity of light in vacuo.
Selecting the zero level of energy suitably, we can replace this relation by
the simpler one

$$\varepsilon = p^2/2m,$$

which gives the expression

$$\mathcal{H} = -(h^2/8\pi^2 m)(\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2)$$

for the energy operator $\mathcal{H}$.
Thus, the time-independent Schrödinger equation becomes

$$-(h^2/8\pi^2 m)(\partial^2 U/\partial x^2 + \partial^2 U/\partial y^2 + \partial^2 U/\partial z^2) = EU,$$

or

$$\partial^2 U/\partial x^2 + \partial^2 U/\partial y^2 + \partial^2 U/\partial z^2 = -(8\pi^2 mE/h^2)U. \tag{53}$$

(We take only the kinetic energy of the particle into account since the
potential of the wall is zero within the vessel.) The solutions of this equation
describe the stationary states of a particle corresponding to the energy
level $E$.

We noted in §5 of the preceding chapter that a linear basis of the solu- 
tions of (53) can be constructed from functions of the form 


U = C sin (az + 7) sin (by + £) sin (cz + £), 
where in this case 
(54) a+ +e = 8rmE/k. 

We again assume that the vessel enclosing this particle is the parallelepiped
$0 \le x \le l_1$, $0 \le y \le l_2$, $0 \le z \le l_3$, $l_1 l_2 l_3 = V$. Since the quantity
$|U(x, y, z)|^2$, which is proportional to the probability of finding the
particle at the point $(x, y, z)$ of the vessel, must be zero at the vessel walls,
any of the conditions $x = 0$, $y = 0$, $z = 0$, $x = l_1$, $y = l_2$, $z = l_3$ imply
$U = 0$. The first three conditions evidently lead to the requirement
$\eta = \zeta = \xi = 0$, so that

$$U = C \sin ax\,\sin by\,\sin cz, \qquad a^2 + b^2 + c^2 = 8\pi^2 mE/h^2.$$

From the fact that $U = 0$ for $x = l_1$, it evidently follows that $a = n_1\pi/l_1$,
where $n_1$ is an integer. Similarly, the last two conditions lead to the
requirements $b = n_2\pi/l_2$, $c = n_3\pi/l_3$, where $n_2$ and $n_3$ are integers. Because
of (54), the numbers $n_1$, $n_2$ and $n_3$ must be related in the following way:

$$n_1^2/l_1^2 + n_2^2/l_2^2 + n_3^2/l_3^2 = 8mE/h^2.$$

This means that the possible energy levels of the stationary states of a
particle are precisely those numbers $E$ which have the form

$$(h^2/8m)\,(n_1^2/l_1^2 + n_2^2/l_2^2 + n_3^2/l_3^2),$$

where $n_1$, $n_2$ and $n_3$ are integers which can be assumed non-negative. Each
such triplet of numbers gives one of the linearly independent solutions of the 
Schrödinger equation (53) corresponding to the given boundary conditions 
($U = 0$ if at least one of the equalities $x = 0$, $y = 0$, $z = 0$, $x = l_1$, $y = l_2$,
$z = l_3$ is satisfied).

Now let $k$ be any integer. In view of the above statements, the number of
linearly independent eigenfunctions of the energy operator corresponding
to eigenvalues $E < k$ will equal the number of solutions of the inequality

$$(h^2/8m)\,(n_1^2/l_1^2 + n_2^2/l_2^2 + n_3^2/l_3^2) < k$$
for integral $n_1 > 0$, $n_2 > 0$, $n_3 > 0$. But this number is evidently
asymptotically (for large $k$) equal to one eighth of the volume of the ellipsoid
$$(h^2/8m)\,(x^2/l_1^2 + y^2/l_2^2 + z^2/l_3^2) = k,$$
i.e., is equal to
$$R(k) = \tfrac{1}{8}\cdot\tfrac{4}{3}\pi\, l_1 l_2 l_3\,(8mk/h^2)^{3/2} = (2^{7/2}/3)\,\pi V\,(m^{3/2}/h^3)\,k^{3/2}.$$
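
The asymptotic statement above can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, the edge lengths 1, 1.5 and 2 are arbitrary, and the single argument `bound` stands for the combination $8mk/h^2$. It counts the positive integer triples satisfying the inequality and compares the count with one eighth of the ellipsoid volume.

```python
import math

def count_states(l1, l2, l3, bound):
    """Count integer triples n1, n2, n3 >= 1 satisfying
    n1^2/l1^2 + n2^2/l2^2 + n3^2/l3^2 < bound,
    where bound stands for the quantity 8mk/h^2 (an assumption of this sketch)."""
    count = 0
    n1 = 1
    while (n1 / l1) ** 2 < bound:
        rest1 = bound - (n1 / l1) ** 2
        n2 = 1
        while (n2 / l2) ** 2 < rest1:
            rest2 = rest1 - (n2 / l2) ** 2
            # number of integers n3 >= 1 with n3 < l3 * sqrt(rest2)
            count += math.ceil(l3 * math.sqrt(rest2)) - 1
            n2 += 1
        n1 += 1
    return count

def octant_volume(l1, l2, l3, bound):
    """One eighth of the volume of the ellipsoid with semi-axes l_i * sqrt(bound)."""
    return (math.pi / 6.0) * l1 * l2 * l3 * bound ** 1.5

if __name__ == "__main__":
    for bound in (100.0, 2500.0, 10000.0):
        exact = count_states(1.0, 1.5, 2.0, bound)
        approx = octant_volume(1.0, 1.5, 2.0, bound)
        print(bound, exact, round(approx), exact / approx)
```

The printed ratio approaches 1 from below as `bound` grows, because each counted lattice point owns a unit cube lying inside the octant of the ellipsoid.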

The number of linearly independent stationary states with energy levels
included between $k$ and $k + dk$ is approximately

$$R'(k)\,dk = 2^{5/2}\,\pi V\,(m^{3/2}/h^3)\,k^{1/2}\,dk, \tag{55}$$

where the interval $dk$ must not be too small (but, understandably, it must
be small compared with $k$) or the approximate formula obtained will not
have real significance. On the other hand, in our other notation this number
is equal to

$$V \sum_{k < r < k + dk} g_r,$$

so that
$$\sum_{k < r < k + dk} g_r \approx (2^{5/2}\,\pi m^{3/2}/h^3)\,k^{1/2}\,dk.$$

Consequently, the “average” value of $g_r$ in the interval $k < r < k + dk$
equals

$$(2^{5/2}\,\pi m^{3/2}/h^3)\,k^{1/2}.$$
As in the photon case, we can put

$$g_k = (2^{5/2}\,\pi m^{3/2}/h^3)\,k^{1/2}$$

directly for the majority of computations, particularly for large values of $k$.
Also, as in the photon case, this number must be doubled if the particles
we consider have “spin”.


Let us make the following remark. In classical physics, the Hamiltonian 
function of a particle within the vessel has the form 


$$H = (1/2m)\,(p_x^2 + p_y^2 + p_z^2).$$
The volume $\mathcal{V}(k)$ of that part of the (six-dimensional) phase space of a
particle in which $H < k$ is equal to the product of the volume $V$ of the
vessel and the volume of the sphere $H = k$. This latter volume is obviously
equal to

$$\tfrac{4}{3}\pi\,(2mk)^{3/2}.$$
Hence,
$$\mathcal{V}(k) = \tfrac{4}{3}\pi V\,(2mk)^{3/2}.$$

Therefore, the volume of that part of the phase space where $k < H < k + 1$
equals

$$\mathcal{V}(k + 1) - \mathcal{V}(k) \approx \mathcal{V}'(k) = 2\pi V\,(2m)^{3/2}\,k^{1/2}.$$
Comparing this with formula (55), we find
$$V g_k \approx [\mathcal{V}(k + 1) - \mathcal{V}(k)]/h^3.$$


This means that, on the average, the number of linearly independent 
stationary states with energy levels between k and k + 1 equals the volume 
of that part of the classical phase space of a particle for which 


k<H<k+1, 


if the quantity $h^3$ is taken as the unit of volume. In other words, one stationary
state is equivalent, in a well-known sense, to a “cell” of volume $h^3$ in the
classical phase space.


Chapter VI 
THERMODYNAMIC CONCLUSIONS 


§1. The problems of statistical thermodynamics 


The most important problem of statistical physics has always been the 
explanation of the laws of thermodynamics on the basis of a model of the 
atomic structure of matter. It is well-known that thermodynamics has 
reached a high level of theoretical development independently of these 
models. Its logical scheme has been reduced to a purely deductive (axio- 
matic) structure where a small number of fundamental principles play the 
role of axioms. From these principles all further laws are deduced by purely 
logical means. The fundamental principles (axioms) are understood as laws 
of nature found by experimental means. Therefore, the problem of basing 
the foundation of thermodynamics on a model of the structure of matter 
always necessitates deriving these “axioms” from the model. The problem 
can be considered as solved when the derivation of these principles has 
been completed. 

How may this derivation be carried out on the basis of our statistical 
theory? In order to find the answer to this question, we must state a funda- 
mental difficulty in principle that arises here. In “classical” (i.e., phenome- 
nological, non-statistical) thermodynamics the variables characterizing a 
given system depend upon a very small number of fundamental variables 
such as energy, volume, temperature, pressure, entropy, etc. If the values 
of an independent subset of these variables are known, then the thermo- 
dynamic state of the system is considered to be uniquely determined. On 
the other hand, in the statistical theory, we can have a situation such that 
for given values of the thermodynamic variables there exists an enormous 
number of different states of the system which are compatible with these 
values. Each eigenfunction U of the energy operator belonging to the eigen- 
value E of this operator, determines one of these possible states. Moreover, 
there is always an entire continuum of these eigenfunctions and even if we 
restrict ourselves to a linear basis of this continuum we will nevertheless 
have a very large, albeit finite, number $\Omega(N, E)$ of such states. In a typical
situation in classical thermodynamics a quantity $A$ is uniquely determined
by the values of the energy E and the volume V of the system. In the sta- 
tistical theory this same quantity can have widely different values in dif- 
ferent states which are compatible with the given E and V. However, if 
the statistical theory is to provide a logical basis for thermodynamic prin- 
ciples, it must certainly yield a unique value for such a quantity. Further- 
more, the value must coincide with that given to the quantity in classical 
thermodynamics. 
As we have mentioned repeatedly, there is only one solution to this difficulty:
In the statistical theory, we must give some mean value <A> for the
value of the quantity A which is to be compared with the corresponding
quantity of classical thermodynamics. The average must be taken over all
states U which are accessible to the system, and must correspond to the 
given energy level E (and also, if necessary, to given values of some other 
parameters). The principle of averaging can be chosen arbitrarily, but we 
are obliged to show the suitability of the averages given by the statistical 
theory, i.e., to show that in an overwhelming majority of the states the 
value of the variable A is extremely close to <A>. When this does not 
happen, it is meaningless to compare the value given by the theory with 
either the value given by experiment or the value given by the classical 
theory. In Chapters IV and V, where the microcanonical principle of aver- 
aging was used, this proof of the suitability was actually demonstrated in 
detail for the most important cases. 

Thus, the path we must follow is clear: For each of the variables of interest 
in classical thermodynamics, we must find a corresponding phase function 
in our statistical theory. Moreover, we must show that the microcanonical 
averages of these phase functions are subject to the same relationships as 
are the corresponding variables in classical thermodynamics. 


§2. External parameters, external forces and their mean values 


The Hamiltonian function of a physical system as well as of an individual 
particle can depend on a number of parameters which define the position or 
the state of external bodies. A change in the values of these parameters 
causes a change in the form of the energy operator and by the same token 
causes changes in the energy levels and the possible states of the system or 
particle. In the preceding development, the presence of this type of parame- 
ter in the expression for the potential energy of a system or particle was by 
no means excluded. In fact, when examining a system of particles enclosed 
within some vessel, we always considered the potential energy to depend 
on the volume V of this vessel. The volume V is the most important parame- 
ter of the type being discussed. Other similar examples of parameters are 
the variables defining the position or the state of the sources of external 
fields which act on the system. Until now, however, we have not paid any 
particular attention to the possible presence of this type of parameter. We 
were able to do this because the values of these parameters were assumed 
to be strictly constant. It was in this sense that we called our system iso- 
lated. Now, however, we must consider interactions between the given sys- 
tem and the bodies surrounding it for which this type of parameter under- 
goes changes. These changes are related to the work done by the particles 
of our system, and, therefore, are related to the change of energy both of the 
particles and of the entire system. Thus, a gas which is enclosed in a vessel
of cylindrical form, upon expanding, generates work by the force of its 
particles which moves a piston and thereby changes the volume of the 
vessel. 

We shall call the parameters just described the external parameters of the 
system. Thus, the Hamiltonian function of each particle depends, not only 
on the usual Hamiltonian variables of this particle, but also on a sequence 
of external parameters $\lambda_1, \lambda_2, \dots, \lambda_s$. From a mathematical point of view,
the external parameters are characterized by the fact that they have the
same value for all particles. In fact, if desired, this latter property can be
taken as the formal definition of an external parameter. [Let us also mention
that the parameters $\alpha$ and $\beta$ which were previously introduced, and
are usually called the inner parameters of the system, have this same property
(i.e., the same value for all particles): We know that for a given structure
of the particles and for known values of the external parameters, the
numbers $\alpha$ and $\beta$ are uniquely determined if the number of particles $N$
and the total energy $E$ of the system are given.]

Let one of the particles of our system be in the $r$th stationary state
(energy level $\varepsilon_r$), and let the elementary work $dw$ performed by this
particle produce a change $d\lambda_i$ in the external parameter $\lambda_i$. This work is
accompanied by a change $d\varepsilon_r$ in the energy of the particle:

$$d\varepsilon_r = (\partial\varepsilon_r/\partial\lambda_i)\,d\lambda_i.$$
Since, due to the law of conservation of energy, $dw = -d\varepsilon_r$, it follows that
$$dw = -(\partial\varepsilon_r/\partial\lambda_i)\,d\lambda_i.$$


In mechanics the coefficient of $d\lambda_i$ in this expression for elementary work is
called the generalized force with which the particle acts upon the external
bodies “in the direction” of the parameter $\lambda_i$. Thus, for the above particle,
this generalized force is equal to $-\partial\varepsilon_r/\partial\lambda_i$. (It should be understood that
the possible energy levels $\varepsilon_r$ of the particle depend on the values of the
external parameters. In Chapters IV and V, for example, we saw that these
levels depend in an essential manner on the volume occupied by the system.)
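
As an illustration of a generalized force, take the volume $V$ as the external parameter. The sketch below assumes a cubic vessel of side $V^{1/3}$, so that the levels found earlier become $(h^2/8m)(n_1^2 + n_2^2 + n_3^2)/V^{2/3}$, and it works in units with $h = m = 1$; the function names are hypothetical. It estimates $-\partial\varepsilon_r/\partial V$ by a central difference and compares it with the closed form $\tfrac{2}{3}\varepsilon_r/V$ that follows from the $V^{-2/3}$ dependence.

```python
def level(n, volume, h=1.0, m=1.0):
    """Energy level of a particle in a cubic box of the given volume
    (assumed cubic, so each edge has length volume ** (1/3))."""
    n1, n2, n3 = n
    return (h ** 2 / (8.0 * m)) * (n1 ** 2 + n2 ** 2 + n3 ** 2) / volume ** (2.0 / 3.0)

def generalized_force(n, volume, step=1e-5):
    """Generalized force -d(level)/d(volume), estimated by a central difference."""
    return -(level(n, volume + step) - level(n, volume - step)) / (2.0 * step)

if __name__ == "__main__":
    n, volume = (1, 2, 3), 2.0
    print(generalized_force(n, volume))           # numerical estimate
    print((2.0 / 3.0) * level(n, volume) / volume)  # closed form (2/3) * level / V
```

The force is positive: the particle pushes outward on the walls, which is the single-particle contribution to the pressure.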

If the change $d\lambda_i$ of the parameter $\lambda_i$ is due to the total work done by all
the particles of the system, then the generalized force $A_i$, with which the
whole system acts upon the external bodies “in the direction” of the parameter
$\lambda_i$, is equal to the sum of the generalized forces of all the individual
particles of the system. Let the system be in the basic state $U$ which
corresponds to a particular choice of the occupation numbers $a_r(U)$. We then
have

$$A_i = -\sum_{r=1}^{\infty} a_r(U)\,(\partial\varepsilon_r/\partial\lambda_i).$$
This relation shows that the generalized external forces $A_i$ are phase
functions in the statistical theory (and are defined at least for the set of basic
eigenfunctions). If the elementary work $\delta W$ done by the whole system
causes the changes $d\lambda_1, d\lambda_2, \dots, d\lambda_s$ in the external parameters, then we
have

$$\delta W = \sum_{i=1}^{s} A_i\,d\lambda_i = -\sum_{i=1}^{s} \Big\{\sum_{r=1}^{\infty} a_r(U)\,(\partial\varepsilon_r/\partial\lambda_i)\Big\}\,d\lambda_i.$$
Since
$$\sum_{r=1}^{\infty} a_r(U)\,\varepsilon_r = E$$
is evidently the total energy of the system,
$$A_i = -\sum_{r=1}^{\infty} a_r(U)\,(\partial\varepsilon_r/\partial\lambda_i) = -\partial E/\partial\lambda_i, \tag{1}$$
and we may write
$$\delta W = -\sum_{i=1}^{s} (\partial E/\partial\lambda_i)\,d\lambda_i.$$


It is necessary to bear in mind that $\partial E/\partial\lambda_i$ is a phase function defined by
the relation (1), so that for given values of the total energy and the external
parameters, $\partial E/\partial\lambda_i$ is still not determined, but depends on the state $U$ in
which the system is found.

The generalized external forces

$$A_i = -\partial E/\partial\lambda_i$$


are thus phase functions (and at the same time, evidently, sum functions). 
In virtue of the general methodological discussion in §1 we must therefore 
assume that the generalized forces considered in phenomenological thermo- 
dynamics have as their analogs in the statistical theory the microcanonical 
averages of these phase functions, i.e., the quantities 


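$$\langle A_i\rangle = -\langle\partial E/\partial\lambda_i\rangle = -\sum_{r=1}^{\infty} \langle a_r\rangle\,(\partial\varepsilon_r/\partial\lambda_i). \tag{2}$$

Likewise, the classical elementary work performed by the system on the
external bodies should be interpreted in our theory as the microcanonical
average

$$\langle\delta W\rangle = -\sum_{i=1}^{s} \langle\partial E/\partial\lambda_i\rangle\,d\lambda_i = -\sum_{i=1}^{s} \Big[\sum_{r=1}^{\infty} \langle a_r\rangle\,(\partial\varepsilon_r/\partial\lambda_i)\Big]\,d\lambda_i.$$

We obtain approximate expressions for the variables $\langle A_i\rangle$ if we
substitute in (2) the approximate expressions

$$\langle a_r\rangle \approx (e^{\alpha + \beta\varepsilon_r} - \sigma)^{-1},$$
of V, §6, where $\sigma$ is the index of symmetry of the system. This gives

$$\langle A_i\rangle \approx -\sum_{r=1}^{\infty} (\partial\varepsilon_r/\partial\lambda_i)\,(e^{\alpha + \beta\varepsilon_r} - \sigma)^{-1} \qquad (i = 1, 2, \dots, s). \tag{3}$$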
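We recall that in V, §4 we found for the function $\Phi(\alpha, \beta)$ the general
expression

$$\Phi(\alpha, \beta) = \prod_{r=1}^{\infty} \Big\{\sum_{n=0}^{\infty} \gamma(n)\,e^{-(\alpha + \beta\varepsilon_r)n}\Big\},$$
where
$$\gamma(n) = \begin{cases} 1/n! & \text{for complete statistics,} \\ 1 & \text{for symmetric statistics,} \\ 1\ (n \le 1),\quad 0\ (n > 1) & \text{for antisymmetric statistics.} \end{cases}$$

Thus, in the case of complete statistics,
$$\Phi(\alpha, \beta) = \prod_{r=1}^{\infty} \exp\,[e^{-\alpha - \beta\varepsilon_r}],$$
and
$$\ln \Phi(\alpha, \beta) = \sum_{r=1}^{\infty} e^{-\alpha - \beta\varepsilon_r}.$$
In the case of symmetric statistics
$$\Phi(\alpha, \beta) = \prod_{r=1}^{\infty} (1 - e^{-\alpha - \beta\varepsilon_r})^{-1},$$
and
$$\ln \Phi(\alpha, \beta) = -\sum_{r=1}^{\infty} \ln\,(1 - e^{-\alpha - \beta\varepsilon_r}).$$
Finally, in the case of antisymmetric statistics
$$\Phi(\alpha, \beta) = \prod_{r=1}^{\infty} (1 + e^{-\alpha - \beta\varepsilon_r}),$$
and
$$\ln \Phi(\alpha, \beta) = \sum_{r=1}^{\infty} \ln\,(1 + e^{-\alpha - \beta\varepsilon_r}).$$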


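An elementary calculation based on these formulas shows that in all
three cases the relation

$$\partial \ln \Phi(\alpha, \beta)/\partial\lambda_i = -\beta \sum_{r=1}^{\infty} (\partial\varepsilon_r/\partial\lambda_i)\,(e^{\alpha + \beta\varepsilon_r} - \sigma)^{-1}$$
is valid. Therefore, for all three cases formula (3) gives

$$\langle A_i\rangle = \beta^{-1}\,(\partial \ln \Phi/\partial\lambda_i) \qquad (1 \le i \le s), \tag{4}$$

and, consequently,
$$\langle\delta W\rangle = \sum_{i=1}^{s} \langle A_i\rangle\,d\lambda_i = \beta^{-1} \sum_{i=1}^{s} (\partial \ln \Phi/\partial\lambda_i)\,d\lambda_i. \tag{5}$$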
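
The "elementary calculation" behind (4) can be spot-checked numerically. The sketch below rests on assumptions that are not in the text: a toy spectrum $\varepsilon_r(\lambda) = r^2/\lambda^2$ truncated at 50 levels, a single external parameter $\lambda$, and hypothetical function names. It evaluates the sum in formula (3) directly and compares it with $\beta^{-1}\,\partial\ln\Phi/\partial\lambda$ obtained by a central difference, for $\sigma = 0$ (complete), $\sigma = 1$ (symmetric) and $\sigma = -1$ (antisymmetric).

```python
import math

def levels(lam, n_levels=50):
    """Toy spectrum eps_r = r^2 / lam^2 (an assumption made for illustration)."""
    return [r * r / (lam * lam) for r in range(1, n_levels + 1)]

def level_derivatives(lam, n_levels=50):
    """d(eps_r)/d(lam) for the toy spectrum."""
    return [-2.0 * r * r / lam ** 3 for r in range(1, n_levels + 1)]

def log_phi(alpha, beta, lam, sigma):
    """ln Phi for complete (0), symmetric (1) or antisymmetric (-1) statistics."""
    total = 0.0
    for eps in levels(lam):
        x = math.exp(-alpha - beta * eps)
        if sigma == 0:
            total += x
        elif sigma == 1:
            total -= math.log(1.0 - x)
        else:
            total += math.log(1.0 + x)
    return total

def force_from_occupations(alpha, beta, lam, sigma):
    """Right side of formula (3): -sum_r (d eps_r / d lam) / (e^(alpha + beta eps_r) - sigma)."""
    return -sum(d / (math.exp(alpha + beta * eps) - sigma)
                for eps, d in zip(levels(lam), level_derivatives(lam)))

def force_from_log_phi(alpha, beta, lam, sigma, step=1e-5):
    """Right side of formula (4): (1/beta) * d ln Phi / d lam, by central difference."""
    diff = log_phi(alpha, beta, lam + step, sigma) - log_phi(alpha, beta, lam - step, sigma)
    return diff / (2.0 * step) / beta

if __name__ == "__main__":
    for sigma in (0, 1, -1):
        print(sigma,
              force_from_occupations(0.5, 0.3, 2.0, sigma),
              force_from_log_phi(0.5, 0.3, 2.0, sigma))
```

For each value of $\sigma$ the two printed numbers agree to within the accuracy of the finite difference.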


In Chapter V we saw that the number of particles $N$ and the total energy
$E$ of the system are obtained by differentiating the function $\ln \Phi(\alpha, \beta)$ with
respect to the parameters $\alpha$ and $\beta$. Now we see that the mean values of the
generalized forces and of the elementary work done by the system are
expressed simply by means of partial derivatives of this same function $\ln \Phi$
with respect to the appropriate external parameters.

Formulas (4) and (5), as we shall see, form the basis of all the thermo- 
dynamic conclusions to be drawn from our statistical theory. 


§3. The definition of entropy and the deduction of the second 
law of thermodynamics 


An expression for the function $\Phi(\alpha, \beta)$, which occupies such a prominent
place in our theory, was established in the preceding section for the three
basic statistical schemes. It depends, as we have seen, not only on $\alpha$ and $\beta$,
but also on the external parameters $\lambda_1, \lambda_2, \dots, \lambda_s$. The logarithmic
derivatives of this function, with respect to its different arguments, give us
expressions for the most important variables characterizing the state of the
system, such as the number of particles $N$, the energy $E$ and the external
forces $\langle A_i\rangle$ ($i = 1, 2, \dots, s$). It is well-known that the so-called
“characteristic function of Planck”, sometimes called “the thermodynamic
potential”, possesses analogous properties in phenomenological thermodynamics.
In the classical theory this function depends on the temperature of
the system and on the external parameters. In our theory there are two
“inner” parameters, $\alpha$ and $\beta$. However, in view of the assumed constancy
of the number of particles $N$ of the system, the relation

$$\partial \ln \Phi(\alpha, \beta)/\partial\alpha = -N$$


permits us to eliminate one of these parameters (e.g., $\alpha$) from the expressions
for the function $\ln \Phi(\alpha, \beta)$ itself, and from any of its partial derivatives.
Thus, we may assume that the function $\ln \Phi(\alpha, \beta)$ and each of its partial
derivatives depend only on the parameters $\beta, \lambda_1, \dots, \lambda_s$. The analogy
with the classical theory naturally leads to the assumption that the parameter
$\beta$ is uniquely related to the temperature of the system. All our knowledge
about this parameter confirms the assumption. Thus, in all cases where the
system consists of several components in thermal contact with each other,
a naturally defined value of the parameter $\beta$ exists which is common for all
these components. The parameter $\beta$ must be related, therefore, to a physical
quantity which, in the case of thermal equilibrium among several systems,
has the same value for all of the parts. In thermodynamics, the temperature
is just such a quantity.
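
The elimination of $\alpha$ by means of the relation $\partial \ln \Phi/\partial\alpha = -N$ can be sketched numerically. The example below is an illustration under assumptions of its own: a hypothetical finite spectrum $\varepsilon_r = 1, 2, \dots, 40$, fixed $\beta$, and bisection as the root-finding method (the text does not prescribe one). It uses the fact that $-\partial \ln \Phi/\partial\alpha = \sum_r (e^{\alpha + \beta\varepsilon_r} - \sigma)^{-1}$, which decreases monotonically in $\alpha$.

```python
import math

def particle_number(alpha, beta, levels, sigma):
    """-d ln Phi / d alpha = sum_r 1 / (e^(alpha + beta eps_r) - sigma)."""
    return sum(1.0 / (math.exp(alpha + beta * eps) - sigma) for eps in levels)

def solve_alpha(n_particles, beta, levels, sigma, lo, hi, iterations=200):
    """Bisection for alpha with particle_number(alpha) = n_particles.
    The caller must supply lo, hi bracketing the root; for sigma = 1 the
    lower end must satisfy lo + beta * min(levels) > 0."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if particle_number(mid, beta, levels, sigma) > n_particles:
            lo = mid  # too many particles: alpha must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    levels = [float(r) for r in range(1, 41)]  # hypothetical spectrum
    beta = 0.5
    alpha_sym = solve_alpha(3.0, beta, levels, 1, -beta * levels[0] + 1e-9, 50.0)
    alpha_anti = solve_alpha(3.0, beta, levels, -1, -50.0, 50.0)
    print(alpha_sym, particle_number(alpha_sym, beta, levels, 1))
    print(alpha_anti, particle_number(alpha_anti, beta, levels, -1))
```

Once $\alpha$ is fixed in this way for each $\beta$, every quantity derived from $\ln \Phi$ becomes a function of $\beta$ and the external parameters alone.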

We recall now that the values of the parameters $\alpha$ and $\beta$ were chosen
(V, §4) such that the function

$$F(\alpha, \beta) = e^{\alpha N + \beta E}\,\Phi(\alpha, \beta)$$

has its minimum value at the point $(\alpha, \beta)$. The function $F(\alpha, \beta)$, like $\Phi(\alpha, \beta)$,
depends on the external parameters $\lambda_1, \lambda_2, \dots, \lambda_s$ as well as on $\alpha$ and $\beta$.
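We have

$$\ln F(\alpha, \beta) = \alpha N + \beta E + \ln \Phi(\alpha, \beta),$$
and, consequently (in view of the constancy of $N$),
$$d \ln F(\alpha, \beta) = N\,d\alpha + E\,d\beta + \beta\,dE + d \ln \Phi(\alpha, \beta)$$
$$= N\,d\alpha + E\,d\beta + \beta\,dE + (\partial \ln \Phi/\partial\alpha)\,d\alpha + (\partial \ln \Phi/\partial\beta)\,d\beta + \sum_{i=1}^{s} (\partial \ln \Phi/\partial\lambda_i)\,d\lambda_i.$$
In view of the relations

$$\partial \ln \Phi/\partial\alpha = -N, \qquad \partial \ln \Phi/\partial\beta = -E$$
and formula (5), §2, we find
$$d \ln F(\alpha, \beta) = \beta\,\{dE + \langle\delta W\rangle\}. \tag{6}$$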


According to the laws of classical thermodynamics, the elementary increment
$dE$ of the energy of a system, caused by corresponding changes in the
temperature and in the external parameters, is composed of the work
performed on the system by the external bodies, which evidently is equal to
$-\langle\delta W\rangle$, and of the quantity of heat $\delta Q$ which is received by the system. Thus,

$$dE = -\langle\delta W\rangle + \delta Q.$$

In the above it should be understood that $\delta Q$, like $\langle\delta W\rangle$, is not the total
differential of any function of the parameters $\alpha$, $\beta$, $\lambda_i$. However, the
relation (6), which may be rewritten in the form

$$d \ln F(\alpha, \beta) = \beta\,\delta Q, \tag{7}$$

clearly shows that the product $\beta\,\delta Q$ is a total differential, i.e., that $\beta$ is an
integrating factor for $\delta Q$.

One of the most convenient and generally used formulations of the second
law of thermodynamics can be stated as follows: There exists a function
$S$ of the temperature and the external parameters, and another function
$\theta$, depending only on the temperature of the system, such that

$$\theta\,\delta Q = dS. \tag{8}$$

In other words, $\delta Q$ has an integrating factor depending only on the
temperature of the system. The relation (7) coincides with the relation (8) if
we set

$$\theta = \beta, \qquad S = \ln F(\alpha, \beta).$$


In thermodynamics the function $\theta$ is given by the expression $1/kT$, where
$T$ is the absolute temperature of the system, and $k$ is the so-called Boltzmann
constant. Therefore, in statistical physics, the physical meaning of the
parameter $\beta$ is always defined by the universal formula

$$\beta = 1/kT.$$

The function $S$ is called the entropy of the system: it is one of the most
important variables of thermodynamic theory. Relation (7) now assumes
the form

$$\delta Q/T = k\,dS,$$


and turns out to be a direct expression of the second law of thermodynamics, 
which thus appears as a direct consequence of our statistical theory. In 
particular, the function $F(\alpha, \beta)$, which was introduced in Chapter V as an
auxiliary mathematical tool, now acquires a most important physical meaning:
the quantity $\ln F(\alpha, \beta)$ is the entropy of the system. In Chapter V
we considered our special choice of values of the parameters $\alpha$ and $\beta$ as a
purely mathematical device designed to yield a simpler form for our asymp- 
totic expressions. Here we see that this choice leads to important physical 
consequences, one of which we now introduce as an example. 

Assume that we have two systems initially completely isolated from each 
other, and also from the surrounding world. We shall denote, respectively, 
by the indices 1 and 2 the variables relating to the first and second system, 
and leave without indices the variables relating to the system obtained 
from the union of the two given systems. It is assumed that sufficient time 
has elapsed for thermal equilibrium to be established in the combined sys- 
tem. Evidently, we will have 


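$$N = N_1 + N_2, \qquad E = E_1 + E_2, \qquad \Phi(\alpha, \beta) = \Phi_1(\alpha, \beta)\,\Phi_2(\alpha, \beta).$$
Hence, the entropy of the combined system is
$$S = N\alpha + E\beta + \ln \Phi(\alpha, \beta) = [N_1\alpha + E_1\beta + \ln \Phi_1(\alpha, \beta)] + [N_2\alpha + E_2\beta + \ln \Phi_2(\alpha, \beta)]. \tag{9}$$
But the function
$$\ln F_1(\alpha, \beta) = N_1\alpha + E_1\beta + \ln \Phi_1(\alpha, \beta)$$

has its minimum value for $\alpha = \alpha_1$, $\beta = \beta_1$, while the function $\ln F_2(\alpha, \beta)$
has its minimum value for $\alpha = \alpha_2$, $\beta = \beta_2$ (since the numbers $\alpha_1$, $\beta_1$, $\alpha_2$,
$\beta_2$ are defined by this requirement). Therefore, relation (9) gives

$$S \ge \ln F_1(\alpha_1, \beta_1) + \ln F_2(\alpha_2, \beta_2) = S_1 + S_2,$$
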
i.e., if two systems initially isolated from each other are brought into thermal
contact, then after equilibrium is established the entropy of the combined system
will never be less than the sum of the entropies initially possessed by the
component systems.
The equilibrium state is characterized by the fact that the parameters $\alpha$
and $\beta$ take on the same values for both components. Further, in the relation
$S \ge S_1 + S_2$ the equality sign will hold true if, and only if, $\alpha_1 = \alpha_2 = \alpha$,
$\beta_1 = \beta_2 = \beta$.
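
The inequality $S \ge S_1 + S_2$ can be illustrated numerically. The sketch below makes several assumptions not found in the text: both systems obey complete statistics, so that $\ln \Phi = e^{-\alpha} Z(\beta)$ with $Z(\beta) = \sum_r e^{-\beta\varepsilon_r}$; the spectra are short hypothetical lists; the minimum over $\alpha$ is taken in closed form ($e^{-\alpha} = N/Z$); and the remaining minimum over $\beta$ is found by a ternary search, which is adequate because the function is convex in $\beta$. The function names are hypothetical.

```python
import math

def log_f_at_best_alpha(beta, n, energy, levels):
    """ln F minimized over alpha for complete statistics:
    alpha*N + e^(-alpha)*Z is smallest at e^(-alpha) = N/Z."""
    z = sum(math.exp(-beta * eps) for eps in levels)
    return n * math.log(z / n) + n + beta * energy

def entropy(n, energy, levels, lo=-10.0, hi=50.0, iterations=200):
    """Return (min over alpha, beta of ln F, minimizing beta) by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_f_at_best_alpha(m1, n, energy, levels) < log_f_at_best_alpha(m2, n, energy, levels):
            hi = m2
        else:
            lo = m1
    beta = 0.5 * (lo + hi)
    return log_f_at_best_alpha(beta, n, energy, levels), beta

if __name__ == "__main__":
    levels1, n1, e1 = [0.0, 1.0, 2.0, 3.0], 10, 8.0   # hypothetical first system
    levels2, n2, e2 = [0.0, 2.0, 4.0], 5, 9.0         # hypothetical second system
    s1, beta1 = entropy(n1, e1, levels1)
    s2, beta2 = entropy(n2, e2, levels2)
    # Combined system: Phi = Phi_1 * Phi_2, i.e. Z = Z_1 + Z_2 for complete statistics.
    s, beta = entropy(n1 + n2, e1 + e2, levels1 + levels2)
    print(s1 + s2, s, beta1, beta2, beta)
```

In this example the two systems start with different values of $\beta$, and the combined entropy strictly exceeds the sum. Joining two identical copies of the first system instead gives equality, in agreement with the equality condition stated above.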

We recall that the law of conservation of energy, which is the first law of 
thermodynamics, was established at the end of Chapter II in the form ap- 
propriate for quantum mechanics. This law, in contrast to the second, can 
be proved without statistical methods. As in classical mechanics, it is a 
simple consequence of the general laws of the evolution of physical systems 
in time (i.e., in the case of quantum physics a consequence of Schrödinger’s
equation). 

On the basis of the two fundamental laws now established, thermody- 
namics can be constructed in a purely deductive fashion independently of 
any special model of the structure of matter. Thus, we have completely es- 
tablished a foundation for thermodynamics on the basis of our statistical 
theory. 


Supplement I 
THE STATISTICS OF HETEROGENEOUS SYSTEMS 


Throughout the main part of this book we have restricted ourselves to 
homogeneous systems, i.e., to systems consisting of particles of identical 
structure. The purpose of this restriction was to simplify the formal appara- 
tus so that the reader could concentrate on the conceptual bases of the 
method. However, without any changes in principle, our method can be 
used to describe the statistics of heterogeneous systems which consist of 
particles of several different types. The complications in the computational 
formulas which are caused by this transition are of a purely technical char- 
acter. Fundamentally, the difference consists only in that in place of the 
two-dimensional limit theorems of the theory of probability analogous 
multi-dimensional theorems must be applied. The formulations, proofs and 
the conditions for their applicability correspond completely to those estab- 
lished in Chapter I for the one-dimensional and two-dimensional cases. As 
a rule, for systems consisting of particles of k different types, it is appropri- 
ate to use a (k + 1)-dimensional limit theorem. The remaining details of 
the methods do not differ from those described in the simple case of homo- 
geneous systems. Of course, for the case of a heterogeneous system, parti- 
cles of different structure can obey different statistics. 

In the present supplementary section we shall discuss briefly systems 
consisting of material particles of two different types. The transition from 
two to three or more types is quite trivial and involves only a simple ex- 
tension of the formulas for the case of two different types. 

Let the system under study consist of N particles of two different types. 
As usual, we allow these particles (in particular, those of different type) to 
exchange energy freely. At the same time the interaction energy of the 
particles is assumed to be so insignificant that in all our calculations we 
can take the total energy of the system equal to the sum of the total ener- 
gies of all its constituent particles. 

Let the numbers of particles of the first and second types be equal to
$N_1$ and $N_2$, respectively ($N_1 + N_2 = N$). We denote the possible energy
levels for particles of the first type (in the usual order) by $\varepsilon_1, \varepsilon_2, \dots$, and
for particles of the second type, by $\eta_1, \eta_2, \dots$. (Both spectra are assumed
to be discrete. This is the same assumption we have made everywhere in
the text.) For the sake of brevity we call the set of particles of the first
(second) type, the first (second) component of the system. Let $U_1, U_2, \dots$
be a complete orthogonal set of “admissible” basic eigenfunctions of
the energy operator of the first component, and let $V_1, V_2, \dots$ be an
analogous set for the second component. In virtue of the general results of
Chapter III, the set of functions $U_i V_j$ ($i, j = 1, 2, \dots$) is a complete
orthogonal set of basic eigenfunctions of the energy operator for the whole
system. In a completely natural fashion we extend to our heterogeneous
system the concept of a structure function. This function, denoted by
$\Omega(N_1, N_2, E)$, is the number of eigenfunctions of the form $U_i V_j$ which
belong to the eigenvalue $E$ of the energy operator of the system. Since the
energy of the system is equal to the sum of the energies of its components,
we denote by $\Omega_1(N_1, E)$ and $\Omega_2(N_2, E)$ the structure functions (defined
in the usual manner) of these components. Thus,

$$\Omega(N_1, N_2, E) = \sum_{x=0}^{E} \Omega_1(N_1, x)\,\Omega_2(N_2, E - x). \tag{1}$$
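
Formula (1) is a convolution of the two component structure functions, and it can be verified on a small example. The sketch below assumes, for illustration only, that both components obey symmetric statistics with non-degenerate positive integer levels, so that each component structure function simply counts the choices of occupation numbers with the prescribed particle number and energy. The function names and the two short spectra are hypothetical.

```python
from itertools import product

def structure_function(levels, n_particles, energy):
    """Number of occupation-number choices a_r >= 0 with
    sum a_r = n_particles and sum a_r * eps_r = energy
    (symmetric statistics, positive integer levels assumed)."""
    ways = {(0, 0): 1}  # (particles used, energy used) -> number of choices
    for eps in levels:
        updated = {}
        for (n, e), w in ways.items():
            a = 0
            while n + a <= n_particles and e + a * eps <= energy:
                key = (n + a, e + a * eps)
                updated[key] = updated.get(key, 0) + w
                a += 1
        ways = updated
    return ways.get((n_particles, energy), 0)

def combined_by_convolution(levels1, n1, levels2, n2, energy):
    """Right side of formula (1)."""
    return sum(structure_function(levels1, n1, x) *
               structure_function(levels2, n2, energy - x)
               for x in range(energy + 1))

def combined_by_enumeration(levels1, n1, levels2, n2, energy):
    """Direct count of all pairs of occupation-number choices for the whole system."""
    count = 0
    for a in product(range(n1 + 1), repeat=len(levels1)):
        if sum(a) != n1:
            continue
        e1 = sum(ar * eps for ar, eps in zip(a, levels1))
        for b in product(range(n2 + 1), repeat=len(levels2)):
            if sum(b) == n2 and e1 + sum(bs * eta for bs, eta in zip(b, levels2)) == energy:
                count += 1
    return count

if __name__ == "__main__":
    levels1, levels2 = [1, 2, 3], [1, 4]  # hypothetical spectra
    print(combined_by_convolution(levels1, 2, levels2, 2, 8))
    print(combined_by_enumeration(levels1, 2, levels2, 2, 8))
```

Both counts agree; for these spectra with two particles of each type and total energy 8 there are exactly two admissible states.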


Assume now that it is known that the system is in some definite state
$U_i V_j$. This means that the first component is in the state $U_i$, while the
second is in the state $V_j$. Since the states $U_i$ and $V_j$ are fundamental, the
values of all the “occupation numbers” are precisely determined, i.e., for
arbitrary $r$ and $s$ the number $a_r$ of particles of the first component in the
state with energy level $\varepsilon_r$, and the number $b_s$ of particles of the second
component in the state with energy level $\eta_s$, have definite values. The
numbers $a_r$ and $b_s$ always satisfy the relations

$$\sum_{r=1}^{\infty} a_r = N_1, \qquad \sum_{s=1}^{\infty} b_s = N_2, \qquad \sum_{r=1}^{\infty} a_r\varepsilon_r + \sum_{s=1}^{\infty} b_s\eta_s = E. \tag{K}$$


Conversely, if a choice of occupation numbers $a_r \ge 0$, $b_s \ge 0$ is given
satisfying the relations (K), then to this choice, in general, there
corresponds a definite number of states of the system of the form $U_i V_j$. By
using the method described in III, §5, it is easy to show that this number
is equal to
$$C_1(N_1)\,C_2(N_2)\,\Big[\prod_{r=1}^{\infty} \gamma_1(a_r)\Big]\Big[\prod_{s=1}^{\infty} \gamma_2(b_s)\Big]. \tag{2}$$

The notation introduced above is analogous to that used in the text:
$C_1(N_1) = N_1!$ or $= 1$ depending upon whether the first component obeys
complete statistics or either of the two other statistics; $C_2(N_2)$ has a similar
meaning for the second component. The functions $\gamma_1(a_r)$ and $\gamma_2(b_s)$ are
defined for the first and second component, respectively, in complete analogy
with our previous function $\gamma(a)$. (Each of these functions also depends
on the type of statistics obeyed by its component.)

In order to obtain the number $\Omega(N_1, N_2, E)$ of eigenfunctions of the form
$U_i V_j$ belonging to the eigenvalue $E$ of the energy operator of the system, we
must obviously sum the expression (2) over all possible choices of the
numbers $a_r \ge 0$, $b_s \ge 0$ which satisfy the relations (K). Thus, we find

$$\Omega(N_1, N_2, E) = C_1(N_1)\,C_2(N_2) \sum_{(K)} \Big[\prod_{r=1}^{\infty} \gamma_1(a_r)\Big]\Big[\prod_{s=1}^{\infty} \gamma_2(b_s)\Big]. \tag{3}$$


This fundamental expression for the structure function is completely 
analogous to formula (14) of III, §5. Here, as there, it serves as the starting
point for all further calculations. In V, §2 we expressed the mean values
of the occupation numbers and their pairwise products in terms of ratios of 
structure functions. We will not carry out the analogous calculation here 
since, if necessary, the reader can independently establish the appropriate 
formulas. 

However, we shall stop to examine, somewhat in detail, the second step 
of the investigation. This step is the reduction of the problem of finding an 
asymptotic estimate for the structure function to a limit problem of the 
theory of probability. 

It is necessary to introduce two degenerate two-dimensional distribution
laws $u_k$ and $v_l$, where $k$ and $l$ are arbitrary integers. We also introduce
three parameters $\alpha_1$, $\alpha_2$ and $\beta$, whose values are defined so that all the series
to be considered will converge absolutely. The random vector $(x, y)$, which
is distributed according to the law $u_k$, has as its possible pairs of values
only points of the form $x = n$, $y = nk$ ($n = 0, 1, \dots$), where

$$P(x = n,\ y = nk) = \gamma_1(n)\,e^{-(\alpha_1 + \beta k)n}\,\Big\{\sum_{m=0}^{\infty} \gamma_1(m)\,e^{-(\alpha_1 + \beta k)m}\Big\}^{-1} \qquad (n = 0, 1, \dots).$$


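Analogously, for the law $v_l$ the possible pairs of values are the points of the
form $x = n$, $y = nl$ ($n = 0, 1, \dots$), where

$$P(x = n,\ y = nl) = \gamma_2(n)\,e^{-(\alpha_2 + \beta l)n}\,\Big\{\sum_{m=0}^{\infty} \gamma_2(m)\,e^{-(\alpha_2 + \beta l)m}\Big\}^{-1} \qquad (n = 0, 1, \dots).$$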
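
A short numerical sketch may make these laws concrete. It builds the probabilities $P(x = n)$ for a single level under each of the three statistics, checks that they sum to 1, and compares the mean of $x$ with $(e^{\alpha + \beta k} - \sigma)^{-1}$, the approximate mean occupation number quoted from V, §6. The function names, the truncation of the series at `n_max` and the parameter values are assumptions made for illustration.

```python
import math

SIGMA = {"complete": 0, "symmetric": 1, "antisymmetric": -1}

def gamma(n, statistics):
    """The weight gamma(n) for the three statistical schemes."""
    if statistics == "complete":
        return math.exp(-math.lgamma(n + 1))  # 1/n!, computed without overflow
    if statistics == "symmetric":
        return 1.0
    if statistics == "antisymmetric":
        return 1.0 if n <= 1 else 0.0
    raise ValueError(statistics)

def law(alpha, beta, k, statistics, n_max=200):
    """Probabilities P(x = n, y = n*k) for n = 0..n_max (series truncated at n_max)."""
    weights = [gamma(n, statistics) * math.exp(-(alpha + beta * k) * n)
               for n in range(n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_x(probabilities):
    """Mean value of the first coordinate x."""
    return sum(n * p for n, p in enumerate(probabilities))

if __name__ == "__main__":
    alpha, beta, k = 0.5, 0.3, 2
    for name, sigma in SIGMA.items():
        probs = law(alpha, beta, k, name)
        print(name, sum(probs), mean_x(probs),
              1.0 / (math.exp(alpha + beta * k) - sigma))
```

The law is a Poisson distribution for complete statistics, a geometric distribution for symmetric statistics, and a two-point distribution for antisymmetric statistics.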


For brevity, we denote, respectively, by $g_{1k}$ and $g_{2l}$ the degrees of
degeneracy of the energy levels $\varepsilon_r = k$ and $\eta_s = l$ for particles of the first and
second types, when the system occupies unit volume. We consider the sum
of an infinite series of mutually independent random vectors $(x_{1i}, y_{1i})$
($i = 1, 2, \dots$), among which there are $g_{1k}$ vectors distributed according to
the law $u_k$ ($k = 1, 2, \dots$), and we set

$$\sum_{i=1}^{\infty} x_{1i} = X_1, \qquad \sum_{i=1}^{\infty} y_{1i} = Y_1.$$

It is easy to show that both series converge with probability 1 (see IV, §3).
Similarly, let $(x_{2i}, y_{2i})$ ($i = 1, 2, \dots$) be a sequence of mutually
independent random vectors, among which there are $g_{2l}$ vectors distributed
according to the law $v_l$ ($l = 1, 2, \dots$), and set

$$\sum_{i=1}^{\infty} x_{2i} = X_2, \qquad \sum_{i=1}^{\infty} y_{2i} = Y_2.$$


Now we set $Y_1 + Y_2 = Y$ and we denote by $P$ the distribution law of the
three-dimensional vector $(X_1, X_2, Y)$. Evidently this law depends only
on the nature of the particles composing the system and on the parameters
$\alpha_1$, $\alpha_2$ and $\beta$. In particular, it is independent of the numbers $N_1$, $N_2$, $V$
and $E$ (if, as we shall assume here, the ratios of these numbers always
maintain constant values).
Assume, finally, that we have the sum

$$(S_{1V},\ S_{2V},\ T_V) \tag{4}$$

of $V$ mutually independent three-dimensional random vectors, each of
which is distributed according to the law $P$ described above. Then, we may
set
$$S_{1V} = \sum_{k=1}^{\infty} \sum_{j=1}^{V g_{1k}} x_{1kj}, \qquad S_{2V} = \sum_{l=1}^{\infty} \sum_{j=1}^{V g_{2l}} x_{2lj},$$
$$T_V = \sum_{k=1}^{\infty} \sum_{j=1}^{V g_{1k}} y_{1kj} + \sum_{l=1}^{\infty} \sum_{j=1}^{V g_{2l}} y_{2lj},$$

where the vector $(x_{1kj}, y_{1kj})$ obeys the law $u_k$, the vector $(x_{2lj}, y_{2lj})$ obeys
the law $v_l$, and all these elementary vectors are mutually independent.
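We find, therefore, for the distribution law of the three-dimensional vector
(4)
$$P(S_{1V} = p_1,\ S_{2V} = p_2,\ T_V = q) = \sum \Big[\prod_{k=1}^{\infty} \prod_{j=1}^{V g_{1k}} P(x_{1kj} = a_{kj},\ y_{1kj} = k a_{kj})\Big] \Big[\prod_{l=1}^{\infty} \prod_{j=1}^{V g_{2l}} P(x_{2lj} = b_{lj},\ y_{2lj} = l b_{lj})\Big], \tag{5}$$

where the summation extends over all possible sets of numbers $a_{kj}$, $b_{lj}$
satisfying the relations

$$\sum_{k=1}^{\infty} \sum_{j=1}^{V g_{1k}} a_{kj} = p_1, \qquad \sum_{l=1}^{\infty} \sum_{j=1}^{V g_{2l}} b_{lj} = p_2, \qquad \sum_{k=1}^{\infty} \sum_{j=1}^{V g_{1k}} k a_{kj} + \sum_{l=1}^{\infty} \sum_{j=1}^{V g_{2l}} l b_{lj} = q. \tag{$K_{p_1 p_2 q}$}$$
According to the definition of the laws $u_k$ and $v_l$, we have
$$P(x_{1kj} = a_{kj},\ y_{1kj} = k a_{kj}) = T_1(k)\,\gamma_1(a_{kj})\,e^{-(\alpha_1 + \beta k) a_{kj}},$$
$$P(x_{2lj} = b_{lj},\ y_{2lj} = l b_{lj}) = T_2(l)\,\gamma_2(b_{lj})\,e^{-(\alpha_2 + \beta l) b_{lj}},$$

where
$$T_1(k) = \Big\{\sum_{m=0}^{\infty} \gamma_1(m)\,e^{-(\alpha_1 + \beta k)m}\Big\}^{-1}, \qquad T_2(l) = \Big\{\sum_{m=0}^{\infty} \gamma_2(m)\,e^{-(\alpha_2 + \beta l)m}\Big\}^{-1}.$$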
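Substituting these expressions into the right side of relation (5), we find

$$P(S_{1V} = p_1,\ S_{2V} = p_2,\ T_V = q)$$
$$= \Big\{\prod_{k} [T_1(k)]^{V g_{1k}}\Big\} \Big\{\prod_{l} [T_2(l)]^{V g_{2l}}\Big\} \sum_{(K_{p_1 p_2 q})} \Big[\prod_{k} \prod_{j} \gamma_1(a_{kj})\Big] \Big[\prod_{l} \prod_{j} \gamma_2(b_{lj})\Big] \exp\Big[-\alpha_1 \sum_{k} \sum_{j} a_{kj} - \alpha_2 \sum_{l} \sum_{j} b_{lj} - \beta\Big(\sum_{k} \sum_{j} k a_{kj} + \sum_{l} \sum_{j} l b_{lj}\Big)\Big]$$
$$= e^{-\alpha_1 p_1 - \alpha_2 p_2 - \beta q}\,\Big\{\prod_{k} [T_1(k)]^{V g_{1k}}\Big\} \Big\{\prod_{l} [T_2(l)]^{V g_{2l}}\Big\} \sum_{(K_{p_1 p_2 q})} \Big[\prod_{k} \prod_{j} \gamma_1(a_{kj})\Big] \Big[\prod_{l} \prod_{j} \gamma_2(b_{lj})\Big].$$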
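The last sum on the right side of this relation differs only in notation from
the sum

$$\sum_{(K)} \Big[\prod_{r=1}^{\infty} \gamma_1(a_r)\Big] \Big[\prod_{s=1}^{\infty} \gamma_2(b_s)\Big],$$

which appears in the right side of formula (3). Thus,
$$P(S_{1V} = p_1,\ S_{2V} = p_2,\ T_V = q) = \Big\{\prod_{k} [T_1(k)]^{V g_{1k}}\Big\} \Big\{\prod_{l} [T_2(l)]^{V g_{2l}}\Big\}\,e^{-\alpha_1 p_1 - \alpha_2 p_2 - \beta q}\,\Omega(p_1, p_2, q)\,[C_1(p_1)\,C_2(p_2)]^{-1}. \tag{6}$$
Summing this equation over all integers $p_1$, $p_2$, $q$, we find
$$1 = \Big\{\prod_{k} [T_1(k)]^{V g_{1k}}\Big\} \Big\{\prod_{l} [T_2(l)]^{V g_{2l}}\Big\} \sum_{p_1, p_2, q} e^{-\alpha_1 p_1 - \alpha_2 p_2 - \beta q}\,\Omega(p_1, p_2, q)\,[C_1(p_1)\,C_2(p_2)]^{-1}.$$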


The sum on the right side of this equation is a function of the parameters 
a, , a, and 8. We denote this sum by (a , a2 , 8), so that 


(Ien m {TTP O] = (Ca, a, 8) 
The relation (6) gives 
P(Siv = pi, Sw = pe, Ty = q) 
= [Bla , az, B) te Op , pa, g) [Clp Clp); 
and, hence, 
Q(pr, Pa, g) = Ela , a2 , BYC(pr)Co( paje": 1272 48 
‘-P(Siy = p, Soy = p, Ty = o. 


This formula, in exact analogy with formula (22) of V, §3, reduces the
problem of finding an asymptotic estimate for the structure function
$\Omega(p_1, p_2, q)$ to a limit problem of the theory of probability, since the
vector $(S_{1V}, S_{2V}, T_V)$ is defined as the sum of $V$ mutually independent
identically distributed three-dimensional random vectors. It should be remarked
that the heterogeneity of our system results only in an increase of the num- 
ber of dimensions in the corresponding problem of the theory of probability. 
As previously, it is necessary to sum only random vectors which are iden- 
tically distributed. 

The last step, the application of the (three-dimensional) local limit
theorem, we shall not consider since, in essence, the procedure does not
differ from that described in the two-dimensional case. We mention only
that the values of the parameters $\alpha_1$, $\alpha_2$ and $\beta$ are determined
from the relations

$$\partial \ln \Phi / \partial \alpha_1 = -N_1, \qquad \partial \ln \Phi / \partial \alpha_2 = -N_2, \qquad \partial \ln \Phi / \partial \beta = -E.$$

The existence and uniqueness of the solutions of these equations may be
established by a procedure analogous to that used in Chapter V. Since
$\ln \Phi(\alpha_1, \alpha_2, \beta)$ (and each of its partial derivatives) is
proportional to $V$, and since the ratios $N_1/V$, $N_2/V$, $E/V$ are assumed to be
constant, the parameters $\alpha_1$, $\alpha_2$ and $\beta$ must be constants.

In regard to the fundamental results of the theory, we remark that the
mean values of the occupation numbers are given by the expressions

$$\langle a_r \rangle = (e^{\alpha_1 + \beta \varepsilon_r} - \sigma_1)^{-1} + O(V^{-1}),$$

and

$$\langle b_s \rangle = (e^{\alpha_2 + \beta \eta_s} - \sigma_2)^{-1} + O(V^{-1}),$$

where $\sigma_1$ and $\sigma_2$, respectively, denote the indices of symmetry of the
first and second components. It should be understood that the energy $E_1$ is
not a constant, but is a phase function of the system, and, in particular, a
sum function. Its mean value can be written directly if the mean values of the
occupation numbers are known. Thus, to a good approximation, we find

$$\langle E_1 \rangle \approx \sum_r \varepsilon_r\,(e^{\alpha_1 + \beta \varepsilon_r} - \sigma_1)^{-1},$$

and, analogously, for the second component

$$\langle E_2 \rangle \approx \sum_s \eta_s\,(e^{\alpha_2 + \beta \eta_s} - \sigma_2)^{-1}.$$
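
As a numerical illustration of the occupation-number formulas above, the
following Python sketch evaluates only the leading term
$(e^{\alpha + \beta \varepsilon} - \sigma)^{-1}$ and the corresponding approximate
mean energy of one component. The function names and the sample values are
hypothetical, the correction term is ignored, and the convention assumed for
the index of symmetry is $\sigma = +1$ (symmetric), $-1$ (antisymmetric), $0$
(complete statistics).

```python
import math


def mean_occupation(alpha, beta, energy, sigma):
    """Leading term (e^{alpha + beta*energy} - sigma)^{-1} of a mean occupation number.

    sigma is the index of symmetry (assumed convention): +1 symmetric,
    -1 antisymmetric, 0 complete statistics.
    """
    if sigma not in (-1, 0, 1):
        raise ValueError("sigma must be -1, 0 or +1")
    denominator = math.exp(alpha + beta * energy) - sigma
    if denominator <= 0:
        # Can only happen for sigma = +1 when alpha + beta*energy <= 0.
        raise ValueError("symmetric statistics requires alpha + beta*energy > 0")
    return 1.0 / denominator


def approximate_mean_energy(alpha, beta, state_energies, sigma):
    """Sum over fundamental one-particle states of energy * mean occupation."""
    return sum(e * mean_occupation(alpha, beta, e, sigma) for e in state_energies)


# Hypothetical sample: four one-particle states, two of them degenerate.
SAMPLE_STATES = [1.0, 1.0, 2.0, 3.0]
SAMPLE_ENERGIES = {
    sigma: approximate_mean_energy(0.5, 1.0, SAMPLE_STATES, sigma)
    for sigma in (-1, 0, 1)
}
```

For the same $\alpha$ and $\beta$ the three statistics are ordered as one would
expect: the symmetric value is the largest and the antisymmetric one the
smallest.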


Supplement II 


THE DISTRIBUTION OF A COMPONENT 
AND ITS ENERGY 


§1 

Let a system be composed of two components, and let us retain the
terminology and the notation introduced in the preceding supplement. The
states of this system, corresponding to an energy level $E$, constitute a
linear manifold. A basis of this manifold is provided by a set of
eigenfunctions of the form $U_i V_j$, where $U_i$ is one of the fundamental
eigenfunctions of the first component, belonging to some energy level $E_1$,
and $V_j$ is one of the fundamental eigenfunctions of the second component,
belonging to some energy level $E_2$. The energy eigenvalues of the
appropriate functions $U_i$ and $V_j$ must satisfy $E_1 + E_2 = E$. As we saw in
the preceding supplement [formula (1)], these facts permit us to write down
immediately the important relation


(1)    $$\Omega(N_1, N_2, E) = \sum_x \Omega_1(N_1, x)\,\Omega_2(N_2, E - x),$$


which will serve as the starting point for our further calculations. 
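
Relation (1) is a discrete convolution of the two structure functions. A
minimal Python sketch, assuming integer energies and two small hypothetical
components given by the energies of their fundamental states, checks the
convolution against a direct count of the product states $U_i V_j$. All names
and the sample data are illustrative only.

```python
def structure_function(state_energies, e_max):
    """omega[x] = number of fundamental states with energy x, for 0 <= x <= e_max."""
    omega = [0] * (e_max + 1)
    for e in state_energies:
        if 0 <= e <= e_max:
            omega[e] += 1
    return omega


def combined_structure_function(omega1, omega2, total_energy):
    """Relation (1): sum over x of Omega_1(x) * Omega_2(E - x)."""
    return sum(
        omega1[x] * omega2[total_energy - x] for x in range(total_energy + 1)
    )


def count_product_states(energies1, energies2, total_energy):
    """Brute force: number of pairs (U_i, V_j) with E_1(U_i) + E_2(V_j) = E."""
    return sum(
        1 for e1 in energies1 for e2 in energies2 if e1 + e2 == total_energy
    )


# Hypothetical toy components (energies of their fundamental states).
ENERGIES_1 = [0, 1, 1, 2, 3]
ENERGIES_2 = [0, 0, 1, 2, 2, 2, 4]
E_TOTAL = 3

OMEGA_1 = structure_function(ENERGIES_1, E_TOTAL)
OMEGA_2 = structure_function(ENERGIES_2, E_TOTAL)
OMEGA_TOTAL = combined_structure_function(OMEGA_1, OMEGA_2, E_TOTAL)
```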

First we find the number of those fundamental functions $U_i V_j$ of our
system, belonging to the energy level $E$, in which the first index $i$ has a
definite value, i.e., in which the first component is found in a definite
fundamental state $U_i$. If this state corresponds to the energy level
$E_1 = E_1(U_i)$ of the first component, then the unknown quantity is obviously
the number of functions $V_j$ corresponding to the energy level $E - E_1$, i.e.,
the number


$$\Omega_2(N_2, E - E_1) = \Omega_2[N_2, E - E_1(U_i)].$$


Suppose now that $\mathfrak{A}$ is a physical quantity that has a definite value in
each of the fundamental states $U_i V_j$ of this system (a phase function!).
Moreover, let this quantity be completely determined by the state $U_i$ of
the first component, and hence be independent of the state $V_j$ of the second
component, so that


$$\mathfrak{A} = f(U_i).$$


Then this quantity retains the same value in all of those
$\Omega_2[N_2, E - E_1(U_i)]$ fundamental states $U_i V_j$ of the system in which the
first factor is equal to $U_i$ (i.e., in which the first component is in the
state $U_i$). Thus the microcanonical average of such a quantity can be written
in the form

(2)    $$\langle \mathfrak{A} \rangle = \langle f(U_i) \rangle = [\Omega(N_1, N_2, E)]^{-1} \sum_i f(U_i)\,\Omega_2[N_2, E - E_1(U_i)].$$


This relation, being almost self-evident, nonetheless has a profound
meaning and, as we shall see later, important consequences. First, it shows
that for quantities which depend only on the first component, the
microcanonical average (which is always an average over all the fundamental
functions $U_i V_j$ of our total system belonging to the energy level $E$) can be
replaced by a certain average over the fundamental eigenfunctions $U_i$ of the
first component. We note immediately, however, that this “reduced” average has
features which sharply distinguish it from the microcanonical average: In the
microcanonical average only states with the same fixed energy participate,
while the definition (2) includes all fundamental eigenfunctions $U_i$ of the
first component, regardless of the energy levels to which they might belong.
Further, in the microcanonical average all participating states have equal
weight while in (2) the fundamental state $U_i$ has the weight


$$\Omega_2[N_2, E - E_1(U_i)] / \Omega(N_1, N_2, E),$$


which depends on the value $E_1(U_i)$ of the energy of the first component
in the state $U_i$, and may therefore be different for different fundamental
states $U_i$.

We consider now the important and frequently encountered special case
in which the quantity $\mathfrak{A}$ depends only on the energy $E_1$ of the first
component, i.e., for all states $U_i$ corresponding to the same energy level
$E_1$, it has the same value


$$\mathfrak{A} = \varphi(E_1) = \varphi[E_1(U_i)].$$


Then in formula (2), $f(U_i) = \varphi(x)$ in all terms for which $E_1(U_i) = x$.
The number of such terms is the number of fundamental eigenfunctions $U_i$
of the first component corresponding to the energy level $x$, i.e.,
$\Omega_1(N_1, x)$. Thus, formula (2) assumes the form


(3)    $$\langle \mathfrak{A} \rangle = \langle \varphi(E_1) \rangle = [\Omega(N_1, N_2, E)]^{-1} \sum_x \varphi(x)\,\Omega_1(N_1, x)\,\Omega_2(N_2, E - x).$$


In particular, the microcanonical average of the energy of the first
component is


(4)    $$\langle E_1 \rangle = [\Omega(N_1, N_2, E)]^{-1} \sum_x x\,\Omega_1(N_1, x)\,\Omega_2(N_2, E - x).$$
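
The exact formulas (2) and (4) can be checked on a toy example. The
following Python sketch assumes integer energies and two hypothetical
components described by the energies of their fundamental states; it compares
the reduced average over the states $U_i$ of the first component with a direct
microcanonical average over all product states $U_i V_j$ on the level $E$. Exact
rational arithmetic is used so that the comparison is an equality. All names
are illustrative.

```python
from fractions import Fraction


def structure_function(state_energies, e_max):
    """omega[x] = number of fundamental states with energy x, for 0 <= x <= e_max."""
    omega = [0] * (e_max + 1)
    for e in state_energies:
        if 0 <= e <= e_max:
            omega[e] += 1
    return omega


def reduced_average(f, energies1, omega2, total_energy):
    """Formula (2): sum_i f(U_i) * Omega_2(E - E_1(U_i)) / Omega(E).

    f receives the index i of the state U_i and its energy E_1(U_i).
    """
    numerator = Fraction(0)
    omega_total = 0
    for i, e1 in enumerate(energies1):
        rest = total_energy - e1
        if 0 <= rest < len(omega2):
            numerator += Fraction(f(i, e1)) * omega2[rest]
            omega_total += omega2[rest]  # accumulates Omega(N1, N2, E)
    return numerator / omega_total


def microcanonical_average(f, energies1, energies2, total_energy):
    """Direct average over all product states U_i V_j with total energy E."""
    values = [
        Fraction(f(i, e1))
        for i, e1 in enumerate(energies1)
        for e2 in energies2
        if e1 + e2 == total_energy
    ]
    return sum(values, Fraction(0)) / len(values)


# Hypothetical toy components.
ENERGIES_1 = [0, 1, 1, 2, 3]
ENERGIES_2 = [0, 0, 1, 2, 2, 2, 4]
E_TOTAL = 3
OMEGA_2 = structure_function(ENERGIES_2, E_TOTAL)


def energy_of_first(i, e1):
    """The special choice f(U_i) = E_1(U_i), which gives formula (4)."""
    return e1


MEAN_E1_REDUCED = reduced_average(energy_of_first, ENERGIES_1, OMEGA_2, E_TOTAL)
MEAN_E1_DIRECT = microcanonical_average(energy_of_first, ENERGIES_1, ENERGIES_2, E_TOTAL)
```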


Formulas (2) through (4) are exact. However, for actual calculations they
are completely useless because of the complexity of the expressions for the
functions $\Omega_1$, $\Omega_2$ and $\Omega$. Therefore, for practical computations, we
must replace them by simpler approximate formulas, which can easily be obtained
with the help of the method developed in this book. We now study several 
important applications of this method. In most cases, we limit ourselves 
to the derivation of asymptotic expressions, and do not pause to make 
detailed estimates of the omitted terms. 


§2 
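To obtain the necessary asymptotic expressions, we first turn to formula
(39) of V, §6. If we replace $\Omega$ by $\Omega_2$ and $N$ by $N_2$ in this formula,
and assume $u_1 = 0$, $u_2 = -x$, then we find


$$\Omega_2(N_2, E - x) / \Omega_2(N_2, E) = e^{-\beta x}\{1 + O[V^{-1}(1 + x^2)]\}.$$

Applying this estimate to the weight function

$$\Omega_2[N_2, E - E_1(U_i)] / \Omega(N_1, N_2, E) = \Omega_2[N_2, E - E_1(U_i)] \Big[\sum_x \Omega_1(N_1, x)\,\Omega_2(N_2, E - x)\Big]^{-1}$$

of formula (2), we easily find for it the very simple asymptotic expression

$$e^{-\beta E_1(U_i)} \Big[\sum_x \Omega_1(N_1, x)\,e^{-\beta x}\Big]^{-1} + O(V^{-1}).$$

Thus, formula (2) can be replaced by the approximate formula

(5)    $$\langle \mathfrak{A} \rangle = \langle f(U_i) \rangle \approx \sum_i f(U_i)\,e^{-\beta E_1(U_i)} \Big[\sum_x \Omega_1(N_1, x)\,e^{-\beta x}\Big]^{-1}.$$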


Our derivation of this formula is very simple, but it contains an
inaccuracy which makes formula (5) incorrect in the general case. Thus, in
formula (39) of V, §6, upon which our calculation is based, $E$ denotes the
fixed total energy of a system whose structure function is $\Omega(N, E)$. But in
our case $E$ denotes the fixed value of the energy of our total system, while
$\Omega_2$ is the structure function of the second component. We shall, however,
consider only systems in which the first component is negligibly small
compared to the second. More precisely, we shall construct our asymptotic
formulas under the assumption that the numbers $N_2$, $E$ and $V$ approach
infinity, maintaining constant ratios, while the number $N_1$ remains
constant. The most interesting application is the case $N_1 = 1$ for which the
first component is an individual particle. In this case the energy $E_2$ of the
second component is asymptotically equal to the energy $E$ of the total
system and, as can be seen without difficulty by a more detailed calculation,
the error we mentioned above does not reduce the accuracy of equation
(5).

For the case just described, we call the first component a small
component of the system. Thus, for quantities which are determined by the
state of a small component (in particular, of an individual particle), the
microcanonical averages can be obtained approximately by averaging over all the
fundamental functions of this component with the weight function


(6)    $$e^{-\beta E_1(U_i)} \Big[\sum_x \Omega_1(N_1, x)\,e^{-\beta x}\Big]^{-1}.$$


In statistical physics this result is called Boltzmann’s law, and it is usual
to consider the weight function (6) as the probability of finding the small 
component in the state U;. There would be no objection to such a ter- 
minology if, in the subsequent development of the theory, the term ‘‘prob- 
ability” always retained this meaning. However, experience has shown that 
on the basis of this terminology, as a rule, a good number of complications 
and imprecise formulations arise. Therefore, here, as in the rest of the 
book, we shall consciously avoid probabilistic terminology, especially since
there is no pressing necessity for its use. 
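
The passage from the exact weight of formula (2) to the exponential weight
(6) can be observed numerically. The sketch below is an illustration under
stated assumptions, not part of the derivation: the second component is taken
to be $N_2$ distinguishable particles (complete statistics) with nondegenerate
integer levels $0, 1, 2, \ldots$, so that $\Omega_2(N_2, E) = \binom{E + N_2 - 1}{N_2 - 1}$,
and $\beta$ is taken from the mean energy per particle of that model,
$\beta = \ln(1 + N_2/E)$. The function names are hypothetical.

```python
from math import comb, exp, log


def omega_bath(n, energy):
    """Ways to distribute `energy` quanta among n distinguishable particles
    with nondegenerate levels 0, 1, 2, ... (assumed toy model)."""
    if energy < 0:
        return 0
    return comb(energy + n - 1, n - 1)


def exact_weight_ratio(n2, total_energy, x):
    """Omega_2(N_2, E - x) / Omega_2(N_2, E), the exact x-dependence of the weight."""
    return omega_bath(n2, total_energy - x) / omega_bath(n2, total_energy)


def beta_of_bath(n2, total_energy):
    """beta for the toy model: mean energy per particle e^{-beta}/(1 - e^{-beta}) = E/N_2."""
    return log(1.0 + n2 / total_energy)


def boltzmann_factor(beta, x):
    """The approximate weight factor e^{-beta x}."""
    return exp(-beta * x)


# Hypothetical sizes: a large second component and small energies x of the first.
N2, E = 2000, 4000
BETA = beta_of_bath(N2, E)
COMPARISON = [
    (x, exact_weight_ratio(N2, E, x), boltzmann_factor(BETA, x)) for x in range(6)
]
```

For these sizes the two columns of `COMPARISON` agree to well within one
percent, and the agreement improves as $N_2$ and $E$ grow in a fixed ratio.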

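With the help of expression (6) for the weight function, formulas (3)
and (4) can be rewritten as asymptotic expressions:


(7)    $$\langle \varphi(E_1) \rangle \approx \sum_x \varphi(x)\,\Omega_1(N_1, x)\,e^{-\beta x} \Big[\sum_x \Omega_1(N_1, x)\,e^{-\beta x}\Big]^{-1}$$

and

(8)    $$\langle E_1 \rangle \approx \sum_x x\,\Omega_1(N_1, x)\,e^{-\beta x} \Big[\sum_x \Omega_1(N_1, x)\,e^{-\beta x}\Big]^{-1}.$$


Thus, the microcanonical average of any function of the energy of a small
component (in particular, of an individual particle) can be obtained
approximately by averaging the function over all possible values $x$ of this
energy with the weight function


$$\Omega_1(N_1, x)\,e^{-\beta x} \Big[\sum_z \Omega_1(N_1, z)\,e^{-\beta z}\Big]^{-1}.$$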


Supplement III 
THE PRINCIPLE OF CANONICAL AVERAGING 


In this book we have always assumed that the system under study was 
energetically isolated, i.e., not exchanging energy with surrounding bodies. 
The whole method of microcanonical averaging is constructed on this 
premise, because the only states which participate in the formation of 
microcanonical averages are states in which the energy of the system has 
a strictly fixed value. However, this requirement of complete energetic 
isolation can be achieved only approximately under real conditions. In 
fact, in many cases of practical importance, the system is in more or less 
intensive energetic (for example, thermal) contact with surrounding bodies. 
As a consequence, its energy does not remain constant and, hence, the prin- 
ciple of microcanonical averaging is deprived of its theoretical foundation. 

The results of Supplement II permit us to use our statistical theory in a 
rational way to construct the statistics of such non-isolated systems. We 
consider the extreme case in which the system can freely exchange energy 
with its almost infinitely large environment (whose energy is many times 
larger than that of our system). Physicists call such an environment a 
“heat bath”, and say that the system is “immersed in a heat bath”. The- 
oretical considerations lead to the conclusion that for a system immersed 
in a heat bath only the temperature of the heat bath is significant. Its other 
properties, including even its material composition (i.e., the nature of its 
component particles), are irrelevant. In particular, nothing prevents us 
from representing the heat bath in the form of an enormous number of 
physical systems which are exact replicas of the one being studied, and 
which freely exchange energy among themselves as well as with the given 
system. 

If we adopt this point of view, then we can consider the combination of 
the given system S and the heat bath T as one isolated system S + T, 
with respect to which our system S, and each of the similar systems which 
comprise the heat bath T, play the role of individual particles. Since the 
system S + T is isolated, we assume, of course, that all the preceding 
principles of microcanonical averaging are justified. Since the system S 
can obviously be considered as a small component (individual particle) of 
the isolated system S + T, a microcanonical average over the states of 
the system S + T is equivalent for system S to an average according to 
formula (2) of Supplement II. In this formula $\Omega_2$ and $\Omega$ denote the
structure functions of systems T and S + T, respectively; $N_1 = 1$ and $N_2$ is
equal to the number of systems (identical to S) which constitute the heat bath.
Also, $E$ is the (constant) total energy of the combined system S + T,
$E_1(U_i)$ is the energy of the given system in the fundamental state $U_i$,
and the summation extends over all such states. This conclusion is exact.
However, as we have seen, formula (2) can to a good approximation be
replaced by the incomparably simpler and more convenient formula (5),
where the summation extends over the same range and where $\Omega_1$ is the
structure function of the system S. As for the parameter $\beta$, we know that
it is universally related to the absolute temperature of the system by the
equation

$$\beta = 1/kT,$$

where $k$ is Boltzmann’s constant (and where the symbol $T$ for the absolute
temperature, which we use here only in passing, is not to be confused with
the symbol for the heat bath). Of course, the temperature of the system
is the same as the temperature of the heat bath. It is clear that the
principle of averaging for the system S, expressed by formula (5), is
independent of the special nature of the heat bath and depends (through the
parameter $\beta$) only on its temperature. This temperature is fixed, both for
the system and for the heat bath, as a result of the energetic contact between
them.

We see, therefore, that if we accept microcanonical averaging as the primary 
basis for statistical computations for isolated systems, then this necessarily 
implies a certain definite principle of averaging for systems which are freely 
exchanging energy with a large environment (heat bath). This principle is 
expressed approximately by formula (5) of the preceding supplement:


(1)    $$\langle \mathfrak{A} \rangle = \langle f(U_i) \rangle \approx \sum_i f(U_i)\,e^{-\beta E(U_i)} \Big[\sum_x \Omega(N, x)\,e^{-\beta x}\Big]^{-1},$$


where $N$ is the number of particles, $\Omega$ is the structure function, $E(U_i)$ is
the energy in state $U_i$, $\beta = 1/kT$, and $T$ is the absolute temperature. (All
these quantities refer to the system immersed in the heat bath.) The sum,
of course, extends over all fundamental states of the system.
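
In computational terms, the principle expressed by formula (1) is a
weighted mean over the fundamental states with weights proportional to
$e^{-\beta E(U_i)}$; the normalizing sum $\sum_x \Omega(N, x) e^{-\beta x}$ is the same
as the sum of $e^{-\beta E(U_i)}$ over all states. A minimal Python sketch,
assuming the system is given by a finite list of state energies (the function
name and the sample data are hypothetical):

```python
import math


def canonical_average(values, energies, beta):
    """Weighted mean of f(U_i) with weights e^{-beta * E(U_i)}.

    values[i] is f(U_i) and energies[i] is E(U_i) for the i-th fundamental state.
    """
    if len(values) != len(energies):
        raise ValueError("values and energies must have the same length")
    weights = [math.exp(-beta * e) for e in energies]
    normalization = sum(weights)  # equals the sum over x of Omega(N, x) e^{-beta x}
    return sum(v * w for v, w in zip(values, weights)) / normalization


# Hypothetical two-state system with energies 0 and 1, averaged quantity = energy.
TWO_STATE_MEAN_ENERGY = canonical_average([0.0, 1.0], [0.0, 1.0], math.log(2.0))
```

With $\beta = \ln 2$ the weights are $1$ and $1/2$, so the mean energy is $1/3$.
For $\beta = 0$ the average reduces to the ordinary arithmetic mean over all
states.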

An average constructed in accordance with this principle is called a 
canonical average. The basic differences between it and the microcanonical 
average are: 1) All the fundamental states of the system participate in 
the canonical average, not just those belonging to a definite energy level, 
as in the microcanonical average. (This, of course, corresponds to the real 
difference between a system immersed in a heat bath, and hence able to 
change its energy, and an isolated system, whose energy remains un- 
changed.) 2) In the canonical average, as opposed to the microcanonical, 
the weights of the different fundamental states are different. The weight 
of any particular fundamental state depends on the corresponding energy 
level, so that all fundamental states belonging to the same energy level 
receive identical weights.

It is obvious that the principles of averaging (7) and (8) of Supplement
II are valid for a system immersed in a heat bath, because they are imme- 
diate consequences of formula (5) of that supplement. 

The principle of canonical averaging has many important practical ad- 
vantages. In particular, canonical averages are incomparably simpler than 
microcanonical ones. Therefore, many authors introduce this principle from 
the beginning as a hypothesis and use it as the basis of all their statistical 
calculations. They sometimes refer to the theorem we have just proved 
(i.e., that microcanonical averaging for isolated systems implies canonical 
averaging for systems immersed in a heat bath), and state that only sys- 
tems of the second type are to be considered in the sequel. However, in 
the majority of cases the principle of canonical averaging is introduced in 
a purely postulational form. It is then applied to various systems, regardless 
of whether they are isolated, immersed in a heat bath, or, as usually hap- 
pens in practice, are in some intermediate state which only more or less 
approximates one of these two extreme types. In particular, this is pre- 
cisely the procedure followed by Gibbs, who first introduced both these 
principles into statistical mechanics and who coined their universally 
accepted names. We may well ask whether or not such a practice is logical. 

In order to answer this question we recall first of all that even the micro- 
canonical principle was introduced as a postulate, whose arbitrariness we 
emphasized repeatedly. Although this choice was later given some justi- 
fication by our proof of the suitability of microcanonical averages, never- 
theless, as we repeatedly emphasized anew, only experiment could give 
final verification to this hypothesis. In particular, we examined in full 
detail a case where this choice was completely refuted by experiment, and 
had to be replaced by another choice (the introduction of the symmetric 
and antisymmetric principles of averaging). Gibbs and his followers in 
introducing the principle of canonical averaging also take this point of view 
that verification can come only from experiment. With the help of this 
principle Gibbs constructs statistical thermodynamics. The various con- 
cepts of this theory are then identified with the corresponding concepts of 
phenomenological thermodynamics. If, in fact, it turns out that the statis- 
tical theory is able to substantiate the fundamental formulas of phenome- 
nological thermodynamics, which have been well verified in practice, then 
this is all that can be required of the theory, and the postulate which was 
introduced is thereby justified. 

However, there are profound differences between the two principles of 
averaging — canonical and microcanonical. Suppose we consider an isolated 
system whose energy has a definite value E. This means that the system is 
really in one of the states belonging to the energy level $E$. Under these
conditions, to include, in the formation of mean values of physical
quantities, states whose energy levels differ from $E$, and in which the system
cannot possibly be found, is clearly inadmissible from a theoretical point 
of view. The practical success of the principle of canonical averaging when 
applied to isolated systems, if it is valid, is not due to the inclusion of these 
states, but results in spite of their inclusion. Another characteristic differ- 
ence between the two principles pertains to any type of system. While 
microcanonical averaging has as its basis only a very general assumption 
(identical weights for all admissible states), and hence is indisputably the 
simplest and most natural of all possible principles, the canonical principle 
ascribes a very special form to the weight function. We would, of course, 
like to know why this particular function appears and not some other one, 
and whether all the conclusions would remain valid if this special assump- 
tion were to be replaced by some more general one. If the canonical prin- 
ciple is taken as a postulate, then all these natural questions are avoided. 
Gibbs says of his own choice only that the canonical weight function is 
very convenient for performing calculations. If, however, one takes the 
path systematically followed in this book and 1) accepts for the treatment 
of isolated systems the very simple and natural microcanonical principle, 
2) modifies this principle in a natural way where theory and experiment
require it (the transition to the “new” statistics!), and, finally, 3) proves 
rigorously the canonical principle for systems immersed in a heat bath, then 
all the doubts and perplexing questions pointed out above disappear com- 
pletely, and we get the same practical conclusions in a manner which is 
theoretically very satisfactory. We feel that for this purpose alone it would 
be worthwhile for the reader to master the fairly simple mathematical 
development required by this method. 

Finally, one can take another point of view of the relationship between 
the canonical and microcanonical averages. Since we used the microcanoni- 
cal principle for isolated physical systems, we were forced to find simple 
asymptotic expressions for the averages obtained because of the complexity 
of the exact expressions. (This constituted the chief problem of our 
method.) On the other hand, in most cases canonical averaging (as the 
development of the theory shows) leads to results which differ but little 
from those of microcanonical averaging, and one can think of the canonical 
principle merely as a simple and unified mathematical prescription for 
finding approximate values for the microcanonical averages. As such, the 
method of canonical averaging is completely acceptable, especially since 
we are forced to provide approximate expressions for the microcanonical 
averages anyway. However, having taken this point of view, we must 
consider under what conditions the canonical averages can actually serve 
as approximations to the microcanonical averages. In particular, we must 
determine the magnitude of the error in these approximate expressions.
We shall do this for the most important physical quantities, i.e., for sum
functions. 

We denote by $F(U)$ an arbitrary phase function of the system and by
$\langle F_x \rangle$ the microcanonical average of the quantity $F(U)$ when $E = x$ is
the total energy of the system. (The average is, of course, carried out over
the manifold $M_x$.) However, the canonical average of the function $F(U)$
(when the system has total energy $E$) is, by formula (1),


$$\langle\langle F_E \rangle\rangle = \sum_i F(U_i)\,e^{-\beta E(U_i)} \Big[\sum_x \Omega(x)\,e^{-\beta x}\Big]^{-1},$$


where for brevity we write $\Omega(x)$ instead of $\Omega(N, x)$, and where the value
of the parameter $\beta$ is determined in the well-known way from the number
of particles $N$ and the total energy $E$. Clearly, we can write


$$\sum_i F(U_i)\,e^{-\beta E(U_i)} = \sum_x e^{-\beta x} \sum_{E(U_i) = x} F(U_i) = \sum_x \Omega(x)\,e^{-\beta x}\,[\Omega(x)]^{-1} \sum_{E(U_i) = x} F(U_i) = \sum_x \langle F_x \rangle\,\Omega(x)\,e^{-\beta x},$$


since $[\Omega(x)]^{-1} \sum_{E(U_i) = x} F(U_i)$ is just the microcanonical average of
the function $F(U)$ when the total energy of the system has the value $x$.

Thus we find, for the canonical average of the function $F(U)$, the
expression


$$\langle\langle F_E \rangle\rangle = \sum_x \langle F_x \rangle\,\Omega(x)\,e^{-\beta x} \Big[\sum_x \Omega(x)\,e^{-\beta x}\Big]^{-1}.$$


This means that the canonical average $\langle\langle F_E \rangle\rangle$ of the phase
function $F(U)$ is a certain weighted average of its microcanonical averages
over all possible energy levels, where, to the level $x$, we ascribe the weight


$$p(x) = \Omega(x)\,e^{-\beta x} \Big[\sum_z \Omega(z)\,e^{-\beta z}\Big]^{-1}.$$


This weight depends, of course, on the total energy $E$ of the system, since
the parameter $\beta$ depends on $E$. Our problem will now be to compare the
canonical average $\langle\langle F_E \rangle\rangle$ of the function $F(U)$ with its
microcanonical average $\langle F_E \rangle$ for the same energy $E$.
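
The decomposition of the canonical average into microcanonical averages
weighted by $p(x)$ is an exact identity and is easy to confirm numerically. The
Python sketch below assumes a hypothetical system given by a finite list of
states with integer energies; it groups the states by energy level, forms the
microcanonical average and the weight of each level, and compares the result
with the direct canonical average. All names are illustrative.

```python
import math
from collections import defaultdict


def canonical_average_direct(values, energies, beta):
    """Sum over states of F(U_i) e^{-beta E(U_i)}, divided by the sum of the weights."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def canonical_average_by_levels(values, energies, beta):
    """Sum over levels x of p(x) * <F_x>, with p(x) proportional to Omega(x) e^{-beta x}."""
    levels = defaultdict(list)
    for v, e in zip(values, energies):
        levels[e].append(v)
    normalization = sum(len(vs) * math.exp(-beta * x) for x, vs in levels.items())
    total = 0.0
    for x, vs in levels.items():
        omega_x = len(vs)                    # Omega(x): number of states on level x
        micro_average = sum(vs) / omega_x    # microcanonical average <F_x>
        p_x = omega_x * math.exp(-beta * x) / normalization
        total += p_x * micro_average
    return total


# Hypothetical states: values of F and energies of five fundamental states.
F_VALUES = [1.0, 2.0, 4.0, 8.0, 3.0]
STATE_ENERGIES = [0, 1, 1, 2, 2]
BETA = 0.7
DIRECT = canonical_average_direct(F_VALUES, STATE_ENERGIES, BETA)
BY_LEVELS = canonical_average_by_levels(F_VALUES, STATE_ENERGIES, BETA)
```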

It is easy to see, by using formulas (22) and (35) of Chapter V, that
the weight function $p(x)$ [where $\Omega(x)$ stands for $\Omega(N, x)$] can be expressed
approximately by the normal law


$$p(x) \approx (2\pi B)^{-1/2}\,e^{-(x - A)^2 / 2B},$$


where, as in Chapter V, $A = E$ as a result of our choice of the parameter
$\beta$. (The two-dimensional normal law of V, §5 reduces here to a
one-dimensional law, since $N$ is constant and hence the parameter $u_1$
vanishes identically.) By a detailed calculation which we omit here, we find
that the dispersion $B$ is an infinitely large quantity of the order of $E$, $V$
and $N$. Thus, we obtain

(2)    $$\langle\langle F_E \rangle\rangle \approx (2\pi B)^{-1/2} \sum_x \langle F_x \rangle\,e^{-(x - E)^2 / 2B}.$$


The weight function has its greatest value for $x = E$ and is negligibly small
for values of $x$ sufficiently far from $E$. Hence, in the weighted average (2)
of the function $\langle F_x \rangle$, only those values of $\langle F_x \rangle$ receive
appreciable weight for which $x$ is sufficiently near $E$. This is a typical
example of expressing a quantity approximately, with the help of an “integral kernel”.
Here, the function corresponding to the integral kernel is the weight func- 
tion p(x). 
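
The concentration of the weight $p(x)$ near $x = E$ can be seen in a model
where everything is computable in closed form. The Python sketch below is an
illustration under an assumption not made in the text: the system consists of
$N$ distinguishable particles with nondegenerate integer levels
$0, 1, 2, \ldots$, so that $\Omega(x) = \binom{x + N - 1}{N - 1}$ and
$\sum_z \Omega(z) e^{-\beta z} = (1 - e^{-\beta})^{-N}$. With
$\beta = \ln(1 + N/E)$ the weight has mean exactly $E$ and dispersion
$E(1 + E/N)$, which is of order $N$. All names are hypothetical.

```python
import math


def level_weight(x, n, beta):
    """p(x) = Omega(x) e^{-beta x} (1 - e^{-beta})^n with Omega(x) = C(x + n - 1, n - 1)."""
    q = math.exp(-beta)
    log_omega = math.lgamma(x + n) - math.lgamma(x + 1) - math.lgamma(n)
    return math.exp(log_omega - beta * x + n * math.log1p(-q))


def beta_for_energy(n, energy):
    """beta chosen so that the mean of p(x) equals the prescribed energy (toy model)."""
    return math.log(1.0 + n / energy)


def weight_moments(n, beta, x_max):
    """Mean and dispersion of p(x), summed numerically over 0 <= x <= x_max."""
    weights = [level_weight(x, n, beta) for x in range(x_max + 1)]
    mean = sum(x * p for x, p in enumerate(weights))
    dispersion = sum((x - mean) ** 2 * p for x, p in enumerate(weights))
    return mean, dispersion, weights


# Hypothetical sizes.
N_PARTICLES, ENERGY = 50, 100
BETA = beta_for_energy(N_PARTICLES, ENERGY)
MEAN, DISPERSION, WEIGHTS = weight_moments(N_PARTICLES, BETA, 2000)
MOST_LIKELY_LEVEL = max(range(len(WEIGHTS)), key=WEIGHTS.__getitem__)
```

In this example the mean is $100$ and the dispersion is $300$, so the weight
is appreciable only within a few multiples of $\sqrt{300} \approx 17$ of the
level $E = 100$.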

To illustrate the computation of the error in the equation
$\langle\langle F_E \rangle\rangle \approx \langle F_E \rangle$, we assume that the quantity
$\langle F_x \rangle$ does not change too rapidly with $x$ near $x = E$. For example, we
assume that (at least for not too large a value of $|y|$) the “Lipschitz
condition”


$$|\langle F_{E+y} \rangle - \langle F_E \rangle| < C\,|y|$$


is satisfied for some constant $C > 0$. Then we obtain


$$\begin{aligned}
|\langle\langle F_E \rangle\rangle - \langle F_E \rangle|
&\approx (2\pi B)^{-1/2}\,\Big|\sum_y \big(\langle F_{E+y} \rangle - \langle F_E \rangle\big)\,e^{-y^2/2B}\Big| \\
&< C\,(2\pi B)^{-1/2} \sum_y |y|\,e^{-y^2/2B} \\
&\approx C\,(2\pi B)^{-1/2} \int |y|\,e^{-y^2/2B}\,dy = C\,B^{1/2}\,(2\pi)^{-1/2} \int |z|\,e^{-z^2/2}\,dz = C^* N^{1/2},
\end{aligned}$$


where $C^*$ is a constant. If the function $F(U)$ is a sum function (or the
mathematical expectation of a sum function), then the quantity $\langle F_E \rangle$
(as we have seen many times) will be infinitely large of order $N$. The last
inequality shows, therefore, that in replacing the microcanonical average
of a sum function by its canonical average, we introduce only a negligibly
small relative error. This is precisely what we wanted to show.
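
The size of the bound in the estimate above can be checked numerically. Since
$\int |z| e^{-z^2/2} dz = 2$, the integral form of the bound equals
$C \sqrt{2B/\pi}$. The Python sketch below, with hypothetical function names,
compares this closed form with the discrete sum over integer $y$ and shows
that the bound divided by a quantity of order $N$ (taking $B = N$ purely for
illustration) decreases like $N^{-1/2}$.

```python
import math


def discrete_error_bound(c, b):
    """C (2 pi B)^{-1/2} * sum over integers y of |y| exp(-y^2 / 2B)."""
    cutoff = int(12 * math.sqrt(b)) + 1  # terms beyond 12 standard deviations are negligible
    total = sum(
        abs(y) * math.exp(-(y * y) / (2.0 * b)) for y in range(-cutoff, cutoff + 1)
    )
    return c * total / math.sqrt(2.0 * math.pi * b)


def integral_error_bound(c, b):
    """C B^{1/2} (2 pi)^{-1/2} * integral of |z| e^{-z^2/2} dz = C sqrt(2B / pi)."""
    return c * math.sqrt(2.0 * b / math.pi)


def relative_error_bound(c, n):
    """Bound divided by N, under the illustrative assumption B = N."""
    return integral_error_bound(c, n) / n


# Hypothetical values of the dispersion B.
BOUNDS = {b: (discrete_error_bound(1.0, b), integral_error_bound(1.0, b)) for b in (100, 10000)}
```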


Supplement IV 


REDUCTION TO A ONE-DIMENSIONAL PROBLEM 
IN THE CASE OF COMPLETE STATISTICS 


The reduction of the problem of finding estimates for structure functions 
to limit problems of the theory of probability was presented in V, §3. In 
the case of complete statistics, this reduction can be replaced by a con- 
siderably simpler one which leads to a one-dimensional limit problem. As 
we shall now see, such a replacement is possible because in the case of 
complete statistics the introduction of two parameters ($\alpha$ and $\beta$) can be
successfully avoided by the introduction of one parameter ($\beta$). In the case
of the other two statistics both parameters are necessary. (We take this
opportunity to mention that the discussion of this question given in §11
of the author’s article [9], although it contains no errors, is unsatisfactory
since this important one-parameter property of the problem is not
emphasized.)

We consider the elementary distribution law

(1)    $$P(z = k) = g_k\,e^{-\beta k} / \Phi(\beta) \qquad (k = 0, 1, \ldots),$$

where


$$\Phi(\beta) = \sum_{k=0}^{\infty} g_k\,e^{-\beta k}.$$


Let the random variable $R = \sum_{i=1}^{N} z_i$ be the sum of $N$ mutually
independent random variables, each of which is distributed according to the
elementary law defined above. Then for a given integer $E$,


$$P(R = E) = \sum \prod_{i=1}^{N} P(z_i = k_i),$$


where the summation extends over all combinations of non-negative
integers $k_i$ satisfying the condition


(L)    $$\sum_{i=1}^{N} k_i = E.$$

We abbreviate this sum by the symbol $\sum_{(L)}$. In virtue of (1),

(2)    $$P(R = E) = [\Phi(\beta)]^{-N}\,e^{-\beta E} \sum_{(L)} \prod_{i=1}^{N} g_{k_i}.$$


If a definite system of numbers $k_i$ ($1 \le i \le N$) is chosen satisfying
equation (L), then we may assume that the energy $k_i$ of the $i$th particle
is fixed. Corresponding to this energy there are $g_{k_i}$ different fundamental
states of the particle. Therefore, since a fundamental state of the system
in the case of complete statistics is uniquely determined by specifying
the fundamental state of each of the particles, it follows that to a given
choice of the numbers $k_i$ ($1 \le i \le N$) there corresponds


$$\prod_{i=1}^{N} g_{k_i}$$


different fundamental states of the system. The total number of
fundamental states of the system corresponding to the given energy level $E$ is,
therefore, equal to


$$\sum_{(L)} \prod_{i=1}^{N} g_{k_i}.$$

But this number is just $\Omega(N, E)$. Hence, formula (2) gives

$$P(R = E) = [\Phi(\beta)]^{-N}\,e^{-\beta E}\,\Omega(N, E),$$

whence

(3)    $$\Omega(N, E) = [\Phi(\beta)]^{N}\,e^{\beta E}\,P(R = E).$$


Since $R$ is the sum of a large number $N$ of mutually independent random
variables, distributed according to the same law (which is independent of 
N and E), relation (3) permits us to find asymptotic expressions for 
Q(N, E) with the help of well-known one-dimensional limit theorems of 
the theory of probability. It is easily seen that the asymptotic expressions 
obtained in this way coincide with those found in Chapter V. 
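
Relation (3) holds exactly for every value of $\beta$, which makes it easy to
verify on a small example. The following Python sketch assumes a hypothetical
particle with finitely many levels $k = 0, 1, \ldots$ of multiplicities $g_k$
(all further $g_k$ being zero); it computes $\Omega(N, E)$ by counting, computes
$P(R = E)$ by repeated convolution of the elementary law, and compares the two
sides of (3). The function names and sample data are illustrative only.

```python
import math


def convolve_truncated(a, b, e_max):
    """Coefficients 0..e_max of the product of the power series a and b."""
    out = [0] * (e_max + 1)
    for i, a_i in enumerate(a[: e_max + 1]):
        for j, b_j in enumerate(b[: e_max + 1 - i]):
            out[i + j] += a_i * b_j
    return out


def n_fold_convolution(seq, n, e_max):
    """n-fold convolution of seq with itself, truncated at index e_max."""
    result = [1] + [0] * e_max
    for _ in range(n):
        result = convolve_truncated(result, seq, e_max)
    return result


def structure_function(g, n, energy):
    """Omega(N, E): sum over k_1 + ... + k_N = E of g_{k_1} * ... * g_{k_N}."""
    return n_fold_convolution(g, n, energy)[energy]


def phi(g, beta):
    """Phi(beta) = sum over k of g_k e^{-beta k}."""
    return sum(g_k * math.exp(-beta * k) for k, g_k in enumerate(g))


def probability_of_sum(g, beta, n, energy):
    """P(R = E) for R = z_1 + ... + z_N with P(z = k) = g_k e^{-beta k} / Phi(beta)."""
    norm = phi(g, beta)
    law = [g_k * math.exp(-beta * k) / norm for k, g_k in enumerate(g)]
    return n_fold_convolution(law, n, energy)[energy]


# Hypothetical multiplicities g_0..g_3, particle number and energy level.
G = [1, 3, 5, 2]
N, E = 6, 9
OMEGA = structure_function(G, N, E)
RIGHT_SIDES = {
    beta: phi(G, beta) ** N * math.exp(beta * E) * probability_of_sum(G, beta, N, E)
    for beta in (0.3, 1.1)
}
```

In an asymptotic calculation one would fix $\beta$ so that the mean of $R$
equals $E$ and replace $P(R = E)$ by its local limit approximation; here
the probability is evaluated exactly so that the identity itself can be tested.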


Supplement V 


SOME GENERAL THEOREMS OF 
STATISTICAL PHYSICS [10] 


§1 

In phenomenological thermodynamics the state of a physical system is 
determined by the assignment of a small number of parameters. Thus, the 
state of a given mass of gas, not under the influence of any external force 
fields, is usually determined by the assignment of its volume and tempera- 
ture. Every other quantity characterizing the state of the gas is then de- 
termined as a function of these two basic quantities, and hence may be 
considered uniquely determined if the values of the volume and tempera- 
ture are known. On the other hand, in statistical thermodynamics, to given 
values of the energy (or, equivalently, the temperature) and the external 
parameters there corresponds not one state but an uncountable set of dif- 
ferent states of the system. (In classical mechanics this set is the entire 
“energy shell”, and in quantum mechanics it is a linear manifold whose di- 
mension is the degree of degeneracy of the given energy level.) Any quantity 
which is determined by the state of the system will generally assume differ- 
ent values for different states of this family, and can no longer be considered 
a unique function of the energy and the external parameters. Thus, there 
are essential differences between the phenomenological and the statistical 
concepts. Since, on the whole, the implications of the phenomenological the- 
ory are considered to be substantiated by experience, the statistical theory 
must provide an answer to the following question: How can it be that, for 
given values of the energy and the external parameters, a quantity which 
can in principle assume different values always maintains the same value 
experimentally, in agreement with the deductions from the phenomenologi- 
cal theory? 

The reasons for this phenomenon were correctly guessed by the founders 
of the statistical theory. Boltzmann, and later Jeans, Lorentz and others 
pointed out repeatedly that quantities which characterize a given system in 
the large (and a phenomenological theory is only concerned with such quan- 
tities), though generally assuming different values in different states (which 
correspond to fixed values of the energy and the external parameters), 
nevertheless remain ‘‘almost constant”. That is, for the overwhelming ma- 
jority of the states of such a family, a quantity takes on values which are 
very near to one another. Hence, an experiment only rarely detects a value 
which differs significantly from a certain definite number. This number is 
the one predicted by the phenomenological theory as the only possible value
of this quantity under the given conditions. It is also the one predicted by
the statistical theory as the most probable or mean value of the quantity. 
This property of phenomenological quantities (which Jeans called their 
“normality”) is explained by the fact that the value of such a quantity de- 
pends on the states of an enormous number of constituent particles. As a 
consequence, a mechanism which is analogous to the law of large numbers 
operates, and the “near-constancy” of the phenomenological quantities is 
thus analogous to the stability of arithmetic means in the theory of proba- 
bility. 

Presumably the above argument correctly characterizes the state of af- 
fairs, and, as far as I know, it has never been subject to doubts. Hence, it 
is all the more important to point out that the assertion contained in this 
argument has not only never been proved, but, so far as I know, it has 
never even received a precise and general mathematical formulation. Just 
what kind of quantities possess Jeans’ “normality” and how can this prop- 
erty be rigorously justified? Apparently this question has not even been 
raised in a sufficiently general form. Until recently, one would encounter 
only passing remarks concerning the “normality” of some particular physi- 
cal quantity. However, Fowler in his well-known treatise [11] posed and 
solved the problem of “normality” for a broad class of quantities, in fact 
for those quantities which I call sum functions. A great many of the quanti- 
ties with which one is concerned in the phenomenological theory are sum 
functions. In my books I established the “normality” of sum functions un- 
der very broad conditions by means of a new method, based on the applica- 
tion of limit theorems of the theory of probability. 

However, sum functions are not the only quantities of interest in a phe- 
nomenological theory. For example, the square of a sum function (an esti- 
mate of which is interesting, if only for the calculation of the dispersion) is 
no longer a sum function. On the other hand, it is by no means true that 
every quantity uniquely determined by the state of the system is of interest 
in the phenomenological theory. Therefore, it is necessary to define pre- 
cisely the class of physical quantities which can be given a reasonable phe- 
nomenological interpretation and with respect to which one might hypothe- 
size (and attempt to prove) the “normal” character. 

In the first place, such a quantity must depend symmetrically on the 
constituent particles of the system. Let us discuss quantum-mechanical 
systems, for definiteness. As before we assume that the possible energy 
levels of a particle are integers. The number of linearly independent states 
of a particle corresponding to a given energy level $r$ ($r = 1, 2, \ldots$) is
asymptotically proportional to the volume $V$ occupied by the system. We denote
this number by $V g_r$ (if $r$ does not belong to the possible energy levels of
the particle, then $g_r = 0$), and we denote the states themselves, enumerated
in any order, by $u_{rs}$ ($1 \le s \le V g_r$). The state of the system is known if
one knows in which of the states $u_{rs}$ each of the particles may be found. (In
the case of Bose or Fermi statistics it is sufficient to know the number of
particles in each of the states $u_{rs}$.) A physical quantity whose value is
changed by a permutation of any pair of particles is meaningful in the 
statistical theory only for Maxwell-Boltzmann statistics. Such a quantity 
is never of interest in a phenomenological theory because the state of the 
system obtained by some permutation of a pair of particles is not, in gen- 
eral, a different phenomenological state. 

If we denote the state of the $i$th particle by $u^{(i)}$, then each quantity of
interest in the phenomenological theory must be a symmetric function of
the states $u^{(1)}, u^{(2)}, \ldots, u^{(N)}$. However, in the great majority of cases,
such a quantity will depend only on the energies of the individual particles,
i.e., it will not change its value if we change some of the states $u^{(i)}$ to other
states having the same energy.

In every case it is easy to show that a quantity which does not possess
this property, in general does not have the normal character. Quantities
of this type of the greatest physical importance are the occupation numbers
$a_{rs}$. ($a_{rs}$ is the number of particles in the state $u_{rs}$.) It is obvious that $a_{rs}$
(in any of the three basic statistical schemes) depends symmetrically on
the particles. However, when the number of particles approaches infinity,
the distribution of the quantity $a_{rs}$ tends to a certain definite limiting
distribution, with a definite, positive mathematical expectation and
dispersion. Hence the occupation numbers do not possess the “normal” character.
On the other hand, the number $N_r$ of particles with a given energy $r$
($N_r = \sum_{s=1}^{V g_r} a_{rs}$) is a symmetric function of the states of the particles which
depends only on the particle energies; and it is well-known that the numbers
$N_r$ possess the “normal” character.

Thus, we arrive naturally at the conclusion that those physical quantities 
which are of phenomenological interest, and which may be expected to 
exhibit (for a large number of particles) the “normal” character, must be 
symmetric functions of the energies $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N$ of the constituent
particles. The fact that this “normality” actually holds (at least in those cases
where the function in question satisfies certain very broad conditions of 
“smoothness”) amounts to a general theorem of statistical physics which 
I shall prove below for quantum-mechanical systems obeying any of the 
three statistical schemes. This theorem also holds for classical systems. 
However, in this case its proof is more complicated mathematically.
Obviously, it can be presumed that this theorem also holds for non-homogeneous
systems (i.e., systems composed of particles of several different types)
if the function in question is symmetric with respect to all the particles of 
the same structure. In the present treatment, however, I shall not touch 
upon these extensions.

Every symmetric function of the energies of the particles depends only
on the numbers of particles in the various energy levels $r$, i.e., it is a unique
function of the numbers $N_r$ ($r = 1, 2, \ldots$) defined above. We shall denote
it by $F(N_1, N_2, \ldots, N_r, \ldots)$, or, more briefly, by $F(N_r)$. Obviously,
this function can also be considered a functional, dependent on the
distribution of particles among the different energy levels, i.e., dependent on
the function $N(x)$, the number of particles whose energies do not exceed
$x$. If this functional is linear, then


    $F(N) = \int_0^{\infty} \psi(x)\, dN(x) = \sum_{r=1}^{\infty} \psi(r) N_r = \sum_{i=1}^{N} \psi(\varepsilon_i).$


Obviously $F(N_r)$ is a sum function. Thus, from this point of view sum
functions become the simplest group among the class of functions under
consideration. [From the point of view of the pure theory of probability, the
limiting behavior of such functionals has been studied carefully and with 
great success by von Mises [12, 13]. However (aside from the fact that he 
considers only finite-valued or continuously-distributed quantities, whereas 
we are interested in the case of a discrete but unbounded spectrum), von 
Mises assumed the energies of the particles to be mutually independent 
random variables, while the basic feature of our problem consists in taking 
account of their mutual dependence, which results from fixing the total 
energy of the system (i.e., the sum of the energies of the particles). The 
results and methods of von Mises can therefore not be applied to our
problem.]
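
The identity between the two sums in the linear case can be illustrated on a toy configuration. The following Python sketch is only an illustration under stated assumptions: the list of particle energies, the choice $\psi(r) = r^2$ and all function names are hypothetical and do not come from the text.

```python
from collections import Counter

def sum_function_from_energies(psi, energies):
    """Sum of psi(eps_i) over the individual particles."""
    return sum(psi(e) for e in energies)

def sum_function_from_counts(psi, counts):
    """Sum of psi(r) * N_r over the energy levels r."""
    return sum(psi(r) * n_r for r, n_r in counts.items())

# Hypothetical configuration: the energy of each of 8 particles.
energies = [1, 3, 3, 2, 5, 1, 1, 2]
counts = Counter(energies)          # N_r = number of particles with energy r

def psi(r):
    return r * r                    # an arbitrary choice of psi for the example

direct = sum_function_from_energies(psi, energies)
via_counts = sum_function_from_counts(psi, counts)
```

Both evaluations use only the numbers $N_r$, which is why a sum function is the simplest example of a symmetric function of the particle energies.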


§2 
In most cases, a function $F(N_1, N_2, \ldots, N_r, \ldots)$, which represents a
quantity studied in the phenomenological theory, increases without bound
as the number of particles $N$ approaches infinity (just as, for example, do
the numbers $N_r$). It goes without saying that by the “normality” of such
a quantity we mean the requirement that, in the great majority of the
states corresponding to a given energy level of the system, the function $F$
should assume the same value, with an error which is small compared to the
value of $F$ itself. Thus, a sum function

    $\sum_{i=1}^{N} \psi(\varepsilon_i)$

will, in general, be of order $N$ and we shall call it normal if the errors
mentioned above are of order $o(N)$. As is usually done in statistical physics,
we shall always assume that the number of particles $N$, the energy of the
system $E$, and the volume it occupies $V$ all increase without bound, while
maintaining constant ratios. It is under this assumption that we shall
establish our asymptotic formulas. We shall also disregard the mutual
potential energy of the particles in our computations, so that the energy of the
system may be regarded as composed additively of the energies of its
constituent particles.

As we have already noted, we must impose on the function F certain 
general requirements in regard to its “smoothness”, i.e., the function F 
cannot vary too rapidly. The fact that a function which varies too rapidly 
cannot be “normal” can be seen by the simplest examples. In particular, 
the function $e^{N_r}$ cannot be normal, as is easily seen by calculating its
dispersion. The limitation on the variation of the functions of interest to us
is conveniently expressed with the help of conditions of the Lipschitz type. 
In order to see which of these conditions is the most natural and convenient 
for our purposes, we consider first, as an example, a fairly general and, to 
a certain extent, typical class of functions 


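(1)  $F(N_r) = \sum_{r=1}^{\infty} r^p N_r^q,$


where $p$, $q > 1$ are positive constants. Let us put $N_r = N l_r$ ($r = 1, 2, \ldots$),
so that $0 \le l_r \le 1$ and $\sum_{r=1}^{\infty} l_r = 1$. Let $N_r' = N l_r'$ be another set of values
of the numbers $N_r$, corresponding to the same values of $N$ and $E$. Then
we have


    $|F(N_r') - F(N_r)| \le \sum_{r=1}^{\infty} r^p\, |(N_r')^q - N_r^q| = N^q \sum_{r=1}^{\infty} r^p\, |(l_r')^q - l_r^q|,$

or, since $|(l_r')^q - l_r^q| \le q\, |l_r' - l_r|$,

    $|F(N_r') - F(N_r)| \le q N^q \sum_{r=1}^{\infty} r^p\, |l_r' - l_r|.$

But $F(N_r) = N^q \sum_{r=1}^{\infty} r^p l_r^q$, whence

    $N^q = F(N_r)/F(l_r),$

and we obtain

(2)  $|F(N_r') - F(N_r)| \le |F(N_r)|\, \varphi(l_r) \sum_{r=1}^{\infty} r^p\, |l_r' - l_r|,$

where $\varphi(l_r) = q/F(l_r)$ is a function of the numbers $l_1, l_2, \ldots, l_r, \ldots$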
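
Inequality (2) can be spot-checked numerically on random occupation numbers. The sketch below is a hedged illustration in Python: the exponents, the number of levels, the random sampling and every function name are assumptions made for the example, and $\varphi(l_r)$ is taken to be $q/F(l_r)$, as the derivation above suggests. Only the common value of $N$ is used; the energy constraint plays no role in the inequality itself.

```python
import random

def F(values, p, q):
    """F = sum over r of r^p * x_r^q, with values[r-1] = x_r."""
    return sum((r ** p) * (x ** q) for r, x in enumerate(values, start=1))

def both_sides(counts_a, counts_b, p, q):
    """Left and right sides of inequality (2) for two sets of N_r with equal N."""
    n_total = sum(counts_a)
    assert n_total == sum(counts_b)
    l_a = [n / n_total for n in counts_a]
    l_b = [n / n_total for n in counts_b]
    lhs = abs(F(counts_b, p, q) - F(counts_a, p, q))
    phi = q / F(l_a, p, q)                      # assumed form of phi(l_r)
    weighted = sum((r ** p) * abs(y - x)
                   for r, (x, y) in enumerate(zip(l_a, l_b), start=1))
    rhs = abs(F(counts_a, p, q)) * phi * weighted
    return lhs, rhs

def random_counts(rng, n_particles, n_levels):
    """Drop n_particles independently into n_levels levels."""
    counts = [0] * n_levels
    for _ in range(n_particles):
        counts[rng.randrange(n_levels)] += 1
    return counts

rng = random.Random(0)                          # fixed seed for reproducibility
results = [both_sides(random_counts(rng, 50, 6), random_counts(rng, 50, 6), 2, 3)
           for _ in range(200)]
```

In every sampled pair the left side should not exceed the right side, up to floating-point rounding.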


This, then, is a Lipschitz condition satisfied by functions of the form (1). 
However, for our purposes we can broaden it considerably and hence prove 
that “normality” is possessed by a much wider class of functions. First of 
all we replace the quantities $|l_r' - l_r|$ by the expressions $|l_r' - l_r|^{\mu}$, where
$\mu$ is an arbitrarily small positive number. Further, we replace the factor
$r^p$ by the expression $e^{\lambda r}$, where $\lambda$ is an arbitrary positive number. Thus, we
replace condition (2) by the much broader condition


(3)  $|F(N_r') - F(N_r)| \le C\, |F(N_r)|\, \varphi(l_r) \sum_{r=1}^{\infty} |l_r' - l_r|^{\mu} e^{\lambda r}.$


This condition, which we impose on the function $F(N_r)$, must be understood
in the following sense: A positive number $\mu$ and a positive function $\varphi(l_r)$
exist, such that, for any value of $\lambda > 0$, no matter how small, inequality (3)
is satisfied for all permissible systems of values $N_r$ and $N_r'$, if the constant $C$
is sufficiently large. In the following, this condition will be called the
extended Lipschitz condition.


§3 

Suppose that a system is composed of $N$ particles and occupies a volume
$V$. Then it is well-known that, to a given energy level $E$ of this system,
there corresponds a certain linear manifold of states, whose dimension is
given by the degree of degeneracy of this eigenvalue of the Hamiltonian
of the system. We shall denote this degree of degeneracy by $\Omega(N, E)$. A
linear basis for this manifold can be chosen in various ways [but it always
consists of $\Omega(N, E)$ terms]. In particular, as was shown in detail in Chapter
III, one can choose, for the elements of this basis, the states which I call
fundamental. These states have the very convenient property that the
numbers $N_r$ have definite values in each of them. (In general, in a given
state of the system, the numbers $N_r$, like all physical quantities, have only
certain particular probability distributions.) In the following, we shall
suppose that these fundamental states have been chosen for the basis, so
that we can speak of a definite value for each number $N_r$ in each of these
states. Moreover, any function $F(N_1, N_2, \ldots, N_r, \ldots)$ of these numbers
will have a definite value in each of the fundamental states.

That property of a quantity, represented by the function $F(N_r)$, which
is usually called its “normality”, can be precisely defined in these terms.
We require the existence of a quantity $\langle F \rangle$, depending on $N$, but not
depending on the particular one of the $\Omega(N, E)$ fundamental states in which
the system is found, such that the ratio


    $[F(N_r) - \langle F \rangle]/\langle F \rangle$


is extremely small for the great majority of these states. More precisely:
If $\epsilon$ is any positive number, and if $\Omega'(N, E)$ is the number of those
fundamental states of the energy level $E$ in which


    $|[F(N_r) - \langle F \rangle]/\langle F \rangle| > \epsilon,$

then we must have


    $\Omega'(N, E)/\Omega(N, E) \to 0 \quad (N \to \infty)$


(assuming, of course, that $E$ and $V$ increase in proportion to $N$).

We now show that this normality, as just defined, is possessed by every
function $F(N_r)$ which satisfies the extended Lipschitz condition defined
above. In other words, normality is an inherent property of all quantities
which depend symmetrically on the energies of the particles, provided that they
do not vary too rapidly. This assertion is the content of Theorem 1,
formulated below. First we must establish a lemma, with the help of which
Theorem 1 can easily be proved.

All of our statements will hold for each of the three basic statistical 
schemes: complete statistics (Maxwell-Boltzmann, or scheme P), sym- 
metric statistics (Bose-Einstein, or scheme S), and antisymmetric statistics 
(Fermi-Dirac, or scheme A). In those places where separate arguments 
must be used for these three schemes, we shall carry them out for each 
separately. 


§4 


We have denoted by $\Omega(N, E)$ the number of fundamental states of the
system corresponding to the energy level $E$. Let $\Omega_r(M)$ be the number of
these states in which $N_r = M$, i.e., in which just $M$ particles have energy
$r$. Let us find an expression for $\Omega_r(M)$ for each of the three basic statistical
schemes. To this end we must consider, besides the energy spectrum of the
particles of our system, another spectrum differing only in that the level $r$
is missing ($g_r = 0$); all the other energy levels are the same (and have the
same degree of degeneracy) as in the original spectrum. All quantities
formed under the assumption of this second spectrum will be distinguished
in the following by an asterisk (*).

In scheme P, in order to specify a fundamental state of the system it is
necessary to fix the state of each particle. The number $\Omega_r(M)$ therefore
stands for the number of such choices of states of the particles in which
we have $M$ particles with energy $r$. In order to realize such a choice, it is
necessary first to choose from the total number $N$ of particles, those $M$
particles whose energies are to be equal to $r$. This can be done in $C(N, M)$
different ways. [We use the symbol $C(N, M)$ to denote the number of
combinations of $N$ things taken $M$ at a time.] Furthermore, each of these
$M$ particles must be put in one of the $V g_r$ states which correspond to the
energy level $r$. This, obviously, can be done in $(V g_r)^M$ different ways. Now
we must assign states to the remaining particles. There are $N - M$
remaining particles, the sum of their energies will be $E - Mr$, and their spectrum
will be just the second spectrum we mentioned above. Thus, the number of
choices of states for these remaining $N - M$ particles is simply equal to
the number of states of a system of $N - M$ particles whose total energy is
$E - Mr$ on the assumption of the second spectrum, i.e., it is equal to
$\Omega^*(N - M, E - Mr)$. In the case of complete statistics, therefore, we have


(P)  $\Omega_r(M) = C(N, M)\,(V g_r)^M\, \Omega^*(N - M, E - Mr).$
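
Formula (P) can be verified by direct enumeration on a very small system. The Python sketch below is an illustration under stated assumptions, not part of the original argument: the spectrum (the values of $V g_r$), the values of $N$ and $E$, and all function names are hypothetical.

```python
from itertools import product
from math import comb

def energies_of_states(spectrum):
    """One energy value per single-particle state; spectrum maps level r -> V*g_r."""
    return [level for level, n_states in spectrum.items() for _ in range(n_states)]

def omega_P(spectrum, N, E):
    """Brute-force count of scheme-P states: N labelled particles, total energy E."""
    states = energies_of_states(spectrum)
    return sum(1 for cfg in product(states, repeat=N) if sum(cfg) == E)

def omega_r_P(spectrum, N, E, r, M):
    """Brute-force count of those states in which exactly M particles have energy r."""
    states = energies_of_states(spectrum)
    return sum(1 for cfg in product(states, repeat=N)
               if sum(cfg) == E and cfg.count(r) == M)

def formula_P(spectrum, N, E, r, M):
    """Right-hand side of (P): C(N, M) (V g_r)^M Omega*(N - M, E - M r)."""
    reduced = {k: v for k, v in spectrum.items() if k != r}   # level r removed
    return comb(N, M) * spectrum[r] ** M * omega_P(reduced, N - M, E - M * r)

spectrum = {1: 2, 2: 1, 3: 2}       # hypothetical values of V*g_r
N, E = 4, 8
checks = [(r, M, omega_r_P(spectrum, N, E, r, M), formula_P(spectrum, N, E, r, M))
          for r in spectrum for M in range(N + 1)]
```

For each level $r$, summing the counts over $M$ should also reproduce the total number of states, since every state has some definite value of $N_r$.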


Let us turn now to symmetric statistics (scheme S). Here, a state of
the system is specified by assigning the numbers of particles which are in
the various states. To the energy level $k$ there correspond $V g_k$ linearly
independent single-particle states. We denote by $a_{ks}$ the number of particles
in the $s$th state with energy $k$ ($k = 1, 2, \ldots$; $s = 1, 2, \ldots, V g_k$;
$\sum_{s=1}^{V g_k} a_{ks} = N_k$). To each fundamental state of the system there corresponds
a definite choice of occupation numbers $a_{ks}$, which, of course, satisfy the
conditions


(4)  $\sum_{k=1}^{\infty} \sum_{s=1}^{V g_k} a_{ks} = N; \qquad \sum_{k=1}^{\infty} k \sum_{s=1}^{V g_k} a_{ks} = E.$


Conversely, to each such choice there corresponds just one of the $\Omega(N, E)$
fundamental states of the system. Consequently, $\Omega_r(M)$ is the number of
choices of occupation numbers which satisfy the relation


(5)  $N_r = \sum_{s=1}^{V g_r} a_{rs} = M$


and the conditions (4). But, one can choose numbers $a_{rs}$ satisfying this
relation in as many ways as there are non-negative integral solutions of the
equation


    $\sum_{i=1}^{V g_r} x_i = M.$


This number is $C(M + V g_r - 1, M)$. (The easiest way to see this is to
note that the number of such solutions is equal to the coefficient of $x^M$
in the expansion of $(1 - x)^{-V g_r}$ in powers of $x$.) This choice having been
made, we have to compute the number of ways that the remaining
occupation numbers $a_{ks}$ (for all $k \ne r$) can be chosen. Since this choice is bound
only by the conditions


    $\sum_k^{*} \sum_s a_{ks} = N - M; \qquad \sum_k^{*} k \sum_s a_{ks} = E - Mr$

[the asterisk means that the term $k = r$ is missing from the sum],
the number of possible choices is equal to $\Omega^*(N - M, E - Mr)$, and we
find

(S)  $\Omega_r(M) = C(M + V g_r - 1, M)\, \Omega^*(N - M, E - Mr).$


Let us turn finally to the antisymmetric case (scheme A). Here there
is only one difference from scheme S: Instead of arbitrary solutions of
equation (5), we must count only those for which $a_{rs} \le 1$ ($s = 1, 2, \ldots, V g_r$),
which, obviously, gives $C(V g_r, M)$ solutions. Thus, we find

(A)  $\Omega_r(M) = C(V g_r, M)\, \Omega^*(N - M, E - Mr).$


Hence, our first problem is solved. To avoid misunderstanding, one must
keep in mind that the expression $\Omega^*(N - M, E - Mr)$ in these formulas
has different values for the three different statistical schemes.
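
The counting formulas (S) and (A) can be checked in the same brute-force manner. In the Python sketch below, a fundamental state is represented as a multiset (scheme S) or a set (scheme A) of occupied single-particle states; the spectrum, the values of $N$ and $E$, and all names are hypothetical choices made for the illustration.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def labelled_states(spectrum):
    """Distinct single-particle states (level, s); spectrum maps level r -> V*g_r."""
    return [(level, s) for level, n_states in spectrum.items() for s in range(n_states)]

def fundamental_states(spectrum, N, scheme):
    """Scheme S: multisets of N states; scheme A: sets of N distinct states."""
    chooser = combinations_with_replacement if scheme == "S" else combinations
    return chooser(labelled_states(spectrum), N)

def omega(spectrum, N, E, scheme):
    """Brute-force number of fundamental states with N particles and energy E."""
    return sum(1 for cfg in fundamental_states(spectrum, N, scheme)
               if sum(level for level, _ in cfg) == E)

def omega_r(spectrum, N, E, r, M, scheme):
    """Brute-force number of those states in which exactly M particles have energy r."""
    return sum(1 for cfg in fundamental_states(spectrum, N, scheme)
               if sum(level for level, _ in cfg) == E
               and sum(1 for level, _ in cfg if level == r) == M)

def formula(spectrum, N, E, r, M, scheme):
    """Right-hand sides of (S) and (A)."""
    n_r = spectrum[r]
    ways = comb(M + n_r - 1, M) if scheme == "S" else comb(n_r, M)
    reduced = {k: v for k, v in spectrum.items() if k != r}   # level r removed
    return ways * omega(reduced, N - M, E - M * r, scheme)

spectrum = {1: 2, 2: 3, 3: 2}       # hypothetical values of V*g_r
N, E = 4, 8
checks = [(scheme, r, M,
           omega_r(spectrum, N, E, r, M, scheme),
           formula(spectrum, N, E, r, M, scheme))
          for scheme in ("S", "A") for r in spectrum for M in range(N + 1)]
```

Note that the reduced count $\Omega^*$ is computed separately for each scheme, in line with the remark that it has different values in the three schemes.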


§5 
If $B$ stands for some condition, depending on the numbers $N_r$, then in
general this condition will be satisfied for some of the $\Omega(N, E)$ fundamental
states of the system belonging to the energy level $E$, and will not be
satisfied for the others. For brevity, let us denote by $P(B)$ the fraction of these
fundamental states for which condition $B$ is satisfied. We call $P(B)$ the
probability of condition $B$. In particular,


    $P(N_r = M) = \Omega_r(M)/\Omega(N, E),$


where $\Omega_r(M)$ is the number we defined above. Thus, we have the
following formulas for this probability:


(6)  $P(N_r = M) = \begin{cases} C(N, M)\,(V g_r)^M\, \Omega^*(N - M, E - Mr)/\Omega(N, E), & \text{(P)} \\ C(M + V g_r - 1, M)\, \Omega^*(N - M, E - Mr)/\Omega(N, E), & \text{(S)} \\ C(V g_r, M)\, \Omega^*(N - M, E - Mr)/\Omega(N, E). & \text{(A)} \end{cases}$





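Now let $\alpha$ and $\beta$ be two real parameters whose values are such that the
following double series will converge:


    $\Phi(\alpha, \beta) = \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} e^{-\alpha p - \beta q}\, \Omega(p, q)/C(p),$


where $C(p) = p!$ in the case of complete statistics and $C(p) = 1$ in the
other two cases. Then the following basic formula [Chapter V, §3,
equation (22)] is valid:


(7)  $\Omega(p, q) = C(p)\, \Phi(\alpha, \beta)\, e^{\alpha p + \beta q}\, P(p, q),$


where $P(p, q)$ is a certain two-dimensional probability distribution. We
shall have more to say about this quantity later on. We have the following
expressions for the function $\Phi(\alpha, \beta)$ (see p. 157):


(8)  $\ln \Phi(\alpha, \beta) = \begin{cases} \sum_{k=1}^{\infty} V g_k\, e^{-\alpha - \beta k}, & \text{(P)} \\ -\sum_{k=1}^{\infty} V g_k \ln[1 - e^{-\alpha - \beta k}], & \text{(S)} \\ \sum_{k=1}^{\infty} V g_k \ln[1 + e^{-\alpha - \beta k}]. & \text{(A)} \end{cases}$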


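If we turn to our “second” system and put (in accordance with our
system of notation)


    $\Phi^*(\alpha, \beta) = \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} e^{-\alpha p - \beta q}\, \Omega^*(p, q)/C(p),$


then $\ln \Phi^*(\alpha, \beta)$ is given by the same sums (8), except that the term $k = r$
is missing; whence

    $\ln[\Phi^*(\alpha, \beta)/\Phi(\alpha, \beta)] = \begin{cases} -V g_r\, e^{-\alpha - \beta r}, & \text{(P)} \\ V g_r \ln(1 - e^{-\alpha - \beta r}), & \text{(S)} \\ -V g_r \ln[1 + e^{-\alpha - \beta r}]. & \text{(A)} \end{cases}$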

Applying formula (7) to the numerator and denominator of the fraction
$\Omega^*(N - M, E - Mr)/\Omega(N, E)$, we find


    $\Omega^*(N - M, E - Mr)/\Omega(N, E)$
    $\quad = [C(N - M)/C(N)]\,[\Phi^*(\alpha, \beta)/\Phi(\alpha, \beta)]\, e^{-(\alpha + \beta r)M} \cdot P^*(N - M, E - Mr)/P(N, E)$
    $\quad = [C(N - M)/C(N)]\, e^{-(\alpha + \beta r)M}\, R_M \begin{cases} \exp[-V g_r\, e^{-\alpha - \beta r}], & \text{(P)} \\ (1 - e^{-\alpha - \beta r})^{V g_r}, & \text{(S)} \\ [1 + e^{-\alpha - \beta r}]^{-V g_r}, & \text{(A)} \end{cases}$


where

    $R_M = P^*(N - M, E - Mr)/P(N, E).$

We now substitute this expression in formula (6). For brevity we
introduce the “index of symmetry” $\sigma$ of the given system [$\sigma = 0$ for (P), $\sigma = 1$
for (S), $\sigma = -1$ for (A)], and we put


    $T_r = [e^{\alpha + \beta r} - \sigma]^{-1}$  [(P), (S), (A)].

Then we easily obtain

(9)  $P(N_r = M) = \begin{cases} [e^{-V g_r T_r} (V g_r T_r)^M / M!]\, R_M, & \text{(P)} \\ C(M + V g_r - 1, M)\,(T_r + 1)^{-V g_r}\, [T_r/(T_r + 1)]^M\, R_M, & \text{(S)} \\ C(V g_r, M)\, T_r^M (1 - T_r)^{V g_r - M}\, R_M. & \text{(A)} \end{cases}$


In these expressions it is interesting to note that in all three cases the
factors preceding $R_M$ represent a very simple probability distribution,
corresponding to a certain integral-valued random variable. In all three cases
we shall call it the “basic distribution”.
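
The three basic distributions are the Poisson, negative binomial and binomial laws, respectively. Their normalization, mean and dispersion can be checked numerically, as in the Python sketch below. The parameter values and function names are hypothetical; the sums are truncated at `m_max`, which is an assumption that is harmless only because the tails are negligible for the chosen parameters.

```python
from math import comb, exp, factorial

SIGMA = {"P": 0, "S": 1, "A": -1}   # index of symmetry of each scheme

def basic_pmf(scheme, n_states, T, M):
    """Factor preceding R_M in (9); n_states stands for V*g_r and T for T_r."""
    if scheme == "P":
        m = n_states * T
        return exp(-m) * m ** M / factorial(M)
    if scheme == "S":
        return comb(M + n_states - 1, M) * (T / (T + 1)) ** M * (T + 1) ** (-n_states)
    if scheme == "A":
        return comb(n_states, M) * T ** M * (1 - T) ** (n_states - M)
    raise ValueError(scheme)

def moments(scheme, n_states, T, m_max=120):
    """Total probability, mean and dispersion, summing M from 0 to m_max."""
    probs = [basic_pmf(scheme, n_states, T, M) for M in range(m_max + 1)]
    total = sum(probs)
    mean = sum(M * p for M, p in enumerate(probs))
    variance = sum((M - mean) ** 2 * p for M, p in enumerate(probs))
    return total, mean, variance

n_states, T = 6, 0.4                # hypothetical V*g_r and T_r (T < 1 for scheme A)
summary = {scheme: moments(scheme, n_states, T) for scheme in SIGMA}
```

With these values each distribution should have mean $V g_r T_r$ and dispersion $V g_r T_r (1 + \sigma T_r)$, the two facts used in §7.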
§6 
First of all, let us estimate the factor

    $R_M = P^*(N - M, E - Mr)/P(N, E).$

The denominator of this expression (which is independent of $M$) can be
obtained from formula (37) of Chapter V, §5. Thus,

(10)  $P(N, E) = d/(2\pi V) + O(V^{-1-\delta}),$


where $d$ and $\delta$ are positive constants.

The estimate of the numerator is much more complicated. However, for
our purposes it is sufficient to consider only values of $M$ which satisfy the
inequality

(11)  $|M - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon},$

where $\epsilon$ is some constant ($0 < \epsilon < 1/4$). Moreover, we require only a very
crude estimate which can be derived easily by elementary means.

By definition, $P^*(N - M, E - Mr)$ is the probability distribution of a
certain integral-valued random vector $(S^*, T^*)$. Hence

    $P^*(N - M, E - Mr) = P(S^* = N - M,\ T^* = E - Mr) \le P(S^* = N - M) \le \sum_{(11)} P(S^* = N - M),$

where the summation extends over all values of $M$ which satisfy condition
(11). It follows that


(12)  $P^*(N - M, E - Mr) \le P\{|S^* - N + V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\}.$

Until now the values of the parameters $\alpha$ and $\beta$ have been arbitrary.
Now we shall choose them in the usual way, so that the relations


    $N + \partial \ln \Phi/\partial \alpha = 0; \qquad E + \partial \ln \Phi/\partial \beta = 0$


are satisfied. (It is well-known that such a choice can always be made
uniquely, and that the resulting values of $\alpha$ and $\beta$ are constant, i.e., do not
depend on $N$, $E$ and $V$.) It is known (p. 139) that the mathematical
expectation of the quantity $S^*$ is

    $\mathsf{E}S^* = -\partial \ln \Phi^*/\partial \alpha = -\partial \ln \Phi/\partial \alpha - V g_r T_r = N - V g_r T_r.$

Using this, and applying the Chebyshev inequality to inequality (12), we
obtain

    $P^*(N - M, E - Mr) \le P\{|S^* - \mathsf{E}S^*| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\} \le \mathsf{D}S^*\, (V g_r/T_r)^{2\epsilon} (V g_r)^{-2},$
where $\mathsf{D}S^*$ is the dispersion of the quantity $S^*$. But $S^*$ is the sum of $V$
mutually independent random variables which have the same constant
(i.e., not depending on $N$, $E$ and $V$) probability distribution. If we denote
the dispersion of this elementary distribution by $b^*$, it follows that


    $\mathsf{D}S^* = V b^*,$

and we find

    $P^*(N - M, E - Mr) \le b^*\, (V g_r/T_r)^{2\epsilon}\, V^{-1} g_r^{-2}.$


We shall be satisfied with this very crude estimate [which holds for all
values of $M$ satisfying condition (11)]. Combining it with estimate (10)
for $P(N, E)$, we find, for all such values of $M$ and for all three statistics,


(13)  $R_M = P^*(N - M, E - Mr)/P(N, E) \le c\, (V g_r/T_r)^{2\epsilon}\, g_r^{-2}.$


(Here $c$ is some positive number, independent of $V$ and $r$. A simple
computation, which we omit, shows that


    $b^* = V^{-1}\, \partial^2 \ln \Phi^*/\partial \alpha^2 < V^{-1}\, \partial^2 \ln \Phi/\partial \alpha^2,$


where the right side is independent of $V$ and $r$.)
It is always assumed, of course, that the number $r$ is one of the possible
energy levels of a particle, i.e., that $g_r > 0$.
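
The statement in this section that the chosen $\alpha$ and $\beta$ do not depend on $N$, $E$ and $V$ (as long as their ratios are fixed) can be illustrated for scheme P, where $T_r = e^{-\alpha - \beta r}$ and the two relations reduce to $N = V \sum_k g_k e^{-\alpha - \beta k}$ and $E = V \sum_k k\, g_k e^{-\alpha - \beta k}$. The Python sketch below is a minimal illustration only: the finite spectrum, the bisection bracket and all names are assumptions, and the other two schemes would need a genuinely two-dimensional solver.

```python
import math

def solve_alpha_beta_P(g, N, E, V):
    """Solve N = V*sum g_k e^{-a-bk}, E = V*sum k g_k e^{-a-bk} for (a, b), scheme P."""
    target = E / N                               # required mean energy per particle

    def mean_energy(beta):
        z = sum(g_k * math.exp(-beta * k) for k, g_k in g.items())
        return sum(k * g_k * math.exp(-beta * k) for k, g_k in g.items()) / z

    lo, hi = -30.0, 30.0                         # assumed bracket for beta
    for _ in range(200):                         # mean_energy decreases with beta
        mid = 0.5 * (lo + hi)
        if mean_energy(mid) > target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    z = sum(g_k * math.exp(-beta * k) for k, g_k in g.items())
    alpha = math.log(V * z / N)                  # from N = V * e^{-alpha} * z
    return alpha, beta

g = {1: 1.0, 2: 2.0, 3: 1.0, 4: 1.0}             # hypothetical g_r on a finite spectrum
small = solve_alpha_beta_P(g, N=100, E=180, V=50)
large = solve_alpha_beta_P(g, N=1000, E=1800, V=500)
```

Both calls use the same ratios $E/N$ and $V/N$, so they should return the same pair $(\alpha, \beta)$ up to rounding.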


§7 
Now let us denote by $P_0$, $\mathsf{E}_0$ and $\mathsf{D}_0$ the probability, mathematical
expectation and dispersion, respectively, of the integral-valued random
variable corresponding to the basic distribution defined by the factor
preceding $R_M$ in each of the three formulas (9). Again applying the Chebyshev
inequality, we find, for all three statistics,


(14)  $P\{|N_r - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\} = \sum_{(11)} P_0(M)\, R_M$
      $\quad \le c\, (V g_r/T_r)^{2\epsilon} g_r^{-2} \sum_{(11)} P_0(M)$
      $\quad = c\, (V g_r/T_r)^{2\epsilon} g_r^{-2}\, P_0\{|N_r - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\}$
      $\quad \le c\, (V g_r/T_r)^{4\epsilon}\, V^{-2} g_r^{-4}\, \mathsf{E}_0\{|N_r - V g_r T_r|^2\}$
      $\quad = c\, (V g_r/T_r)^{4\epsilon}\, V^{-2} g_r^{-4}\, \mathsf{D}_0(N_r).$


The above follows since each of the three basic distributions has (as is
shown by an elementary computation) the mathematical expectation
$V g_r T_r$, and hence in all three cases


    $\mathsf{E}_0\{|N_r - V g_r T_r|^2\} = \mathsf{D}_0(N_r).$

But it is also easily shown that in all three cases


    $\mathsf{D}_0(N_r) = V g_r T_r (1 + \sigma T_r),$


where $\sigma$ is the index of symmetry. Therefore, inequality (14) gives


    $P\{|N_r - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\} \le c\, T_r^{1-4\epsilon}\, (V g_r)^{-(1-4\epsilon)}\, g_r^{-2}\, (1 + \sigma T_r),$


or, since $g_r \ge 1$,

    $P\{|N_r - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}\} \le c\, (T_r/V)^{1-4\epsilon} (1 + \sigma T_r).$
This is our estimate of the probability of the inequality

(15)  $|N_r - V g_r T_r| > T_r^{\epsilon} (V g_r)^{1-\epsilon}$


for a fixed value of $r$. The probability that inequality (15) holds for at
least one value of $r$ ($1 \le r < \infty$) does not exceed the sum of the probabilities
just estimated, i.e., it does not exceed the quantity


    $c\, V^{-(1-4\epsilon)} \sum_{r=1}^{\infty} T_r^{1-4\epsilon} (1 + \sigma T_r) = c' V^{-(1-4\epsilon)}$

(where $c'$ is a positive constant). Thus, we have proved the following
proposition:

Lemma. With a probability greater than

    $1 - c' V^{-(1-4\epsilon)}$

we can assert that for any $r$ ($1 \le r < \infty$),

(16)  $|N_r - V g_r T_r| \le T_r^{\epsilon} (V g_r)^{1-\epsilon},$


where $\epsilon$ is any positive number in the interval $0 < \epsilon < 1/4$, and $c'$ is some
positive constant. According to our definition of probability this means that the
whole set of inequalities (16) will hold for all fundamental states of the
energy level $E$ with the exception of a fraction no larger than $c' V^{-(1-4\epsilon)}$ of the states.
(Thus, the fraction of the fundamental states represented by the exceptional
ones tends to zero with increasing $N$, $E$ and $V$.)
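
The Chebyshev step used in §7 for the basic distribution is deliberately crude, which a small numerical comparison makes visible. The Python sketch below takes the scheme-A (binomial) basic distribution and compares the exact probability of a deviation beyond $T_r^{\epsilon}(V g_r)^{1-\epsilon}$ with the bound $\mathsf{D}_0/t^2$. The parameter values and names are hypothetical, and the threshold is written in the form reconstructed in (11), which is itself an assumption about the garbled source.

```python
from math import comb

def binomial_pmf(n_states, T, M):
    """Scheme-A basic distribution: C(n, M) T^M (1 - T)^(n - M), with n = V*g_r."""
    return comb(n_states, M) * T ** M * (1 - T) ** (n_states - M)

def tail_and_chebyshev_bound(n_states, T, eps):
    """Exact P0{|N_r - mean| > t} and the Chebyshev bound D0 / t^2."""
    mean = n_states * T
    dispersion = n_states * T * (1 - T)          # V g_r T_r (1 + sigma T_r), sigma = -1
    threshold = T ** eps * n_states ** (1 - eps) # t = T_r^eps (V g_r)^(1 - eps)
    tail = sum(binomial_pmf(n_states, T, M)
               for M in range(n_states + 1) if abs(M - mean) > threshold)
    return tail, dispersion / threshold ** 2

# Hypothetical values: V*g_r = 400, T_r = 0.3, eps = 0.2.
tail, bound = tail_and_chebyshev_bound(400, 0.3, 0.2)
```

The exact tail is far smaller than the bound; the argument only needs the bound, since it is uniform in $r$ and summable over all levels.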


§8 


THEOREM 1. A function $F(N_1, N_2, \ldots, N_r, \ldots)$, which satisfies the
extended Lipschitz condition, is normal.
Let us put


    $N_r/N = l_r, \qquad (V/N)\, g_r T_r = \lambda_r \qquad (r = 1, 2, \ldots),$

so that the numbers $\lambda_r$ can be considered constants. Then the quantity

    $\langle F \rangle = F(N\lambda_1, N\lambda_2, \ldots, N\lambda_r, \ldots)$


will depend on $N$ (and hence on $E$ and $V$) but not on the particular state
of the system (and, in particular, it will not depend on the numbers $N_r$).
Since the function $F$ by hypothesis satisfies the extended Lipschitz
condition, we have, by (3), for any $\lambda > 0$,


(17)  $|F(N_r) - \langle F \rangle| \le C\, \langle F \rangle\, \varphi(\lambda_r) \sum_{r=1}^{\infty} |l_r - \lambda_r|^{\mu} e^{\lambda r},$


where $\varphi(\lambda_r) = \varphi(\lambda_1, \lambda_2, \ldots)$ depends only on the numbers $\lambda_r$ and is thus
constant.

Now we decompose the set of all $\Omega(N, E)$ fundamental states of the
system for the energy level $E$ into two classes $K_1$ and $K_2$, putting those states
into class $K_1$ for which inequality (16) holds for all $r$. The remaining
states are put into class $K_2$. We denote by $\Omega_1(N, E)$ and $\Omega_2(N, E)$ the
numbers of states in these two classes, respectively. By our lemma,


    $\Omega_2(N, E)/\Omega(N, E) \le c' V^{-(1-4\epsilon)} \to 0 \quad (N \to \infty).$

On the other hand, for each state of class $K_1$, we have


    $|l_r - \lambda_r| = N^{-1} |N_r - V g_r T_r| \le c_1\, g_r^{1-\epsilon} (T_r/V)^{\epsilon} \le c_2\, g_r^{1-\epsilon}\, e^{-\epsilon \beta r}\, V^{-\epsilon},$

where $c_1$ and $c_2$ are positive constants; whence


    $|l_r - \lambda_r|^{\mu} \le c_2^{\mu}\, g_r^{\mu(1-\epsilon)}\, e^{-\epsilon \mu \beta r}\, V^{-\epsilon \mu}.$


Thus, for any state of the class $K_1$, inequality (17) gives (if we remember
that $\varphi(\lambda_r)$ is constant)


    $|F(N_r) - \langle F \rangle| \le c_3\, V^{-\epsilon \mu}\, \langle F \rangle \sum_{r=1}^{\infty} g_r^{\mu(1-\epsilon)}\, e^{-(\epsilon \mu \beta - \lambda) r}.$


If we choose $\lambda < \epsilon \mu \beta$, the series on the right converges and its sum is some
constant; therefore,


(18)  $|F(N_r) - \langle F \rangle| \le c_4\, V^{-\epsilon \mu}\, \langle F \rangle.$


But this establishes the normality of the function $F(N_r)$ since inequality
(18) holds for all states of class $K_1$, which comprises the overwhelming
majority of the $\Omega(N, E)$ fundamental states with energy $E$.

Moreover, we have obtained a simple method of finding the quantity
$\langle F \rangle$ which is representative of the values of $F$ in this majority of states.
To obtain $\langle F \rangle$ we must substitute in the expression of the function $F$ not
the numbers $N_r$ but their microcanonical averages $V g_r T_r$ ($r = 1, 2, \ldots$).
The number


    $\langle F \rangle = F(V g_1 T_1, V g_2 T_2, \ldots)$


will also be the value predicted by the statistical theory for the quantity
expressed by the function $F(N_1, N_2, \ldots)$ for given values of $N$, $E$ and $V$.


§9 


It is well-known that the microcanonical averaging of a quantity, for
given values of $N$, $E$ and $V$, refers to the simple arithmetic average of its
values for the $\Omega(N, E)$ corresponding fundamental states. In other words,
in microcanonical averaging, all accessible states have the same weight. If
we normalize these weights so that their sum over all accessible states is
unity, then we call the weights probabilities of the corresponding states.
Thus, one can say that in microcanonical averaging, all accessible states
have the same probability $1/\Omega$. [Here, and in the following, we write $\Omega$
instead of $\Omega(N, E)$ for brevity.]

In statistical physics after the work of Gibbs another principle of averag- 
ing, called canonical averaging, became widely used. In canonical averaging, 
as distinct from microcanonical, all fundamental states consistent with the 
given values of N and V are assigned positive probabilities, regardless of 
the energy of the system. Of course, these probabilities can no longer all be 
equal to one another (since there are infinitely many such states). In
canonical averaging each state with energy $E$ is assigned the probability

    $c\, e^{-\beta E},$

where $\beta$ is a positive constant and where $c$ is a coefficient chosen so that the
sum of the probabilities of all fundamental states will be 1.

Canonical averaging is very widely used, chiefly because it is much easier 
to calculate averages and their properties using the canonical distribution 
than it is to calculate using the microcanonical distribution. But, with re- 
spect to the theoretical principles of statistical physics, the canonical 
method of averaging at first sight always gives the impression of being 
rather arbitrary. Moreover, in the formation of mean values the canoni- 
cal method unnecessarily brings in states of the system which in general 
cannot exist for a given energy $E$, i.e., states with other values of the energy.
(In the microcanonical method the probability of every such state is simply
0.) Another disadvantage is that canonical averaging makes use of a very
special distribution function which cannot be justified a priori except in 
that it is very convenient for computational purposes. (The microcanonical 
average, on the other hand, is based on the very simple distribution in which 
all accessible states are given the same probability.)

There are several different ways to overcome these difficulties. For ex- 
ample, one can show that in a very broad class of problems, including most 
of the ones frequently met in practice, canonical averages are practically 
identical with microcanonical averages. Hence, with complete justification 
we can consider canonical averages as convenient mathematical approxima- 
tions to microcanonical averages. (See Supplement III.) However, the dif- 
ficulties under discussion are usually approached from another point of view 
which is also completely satisfactory. This approach is based on a mathe- 
matical theorem proved by Gibbs, under special assumptions, and later 
generalized considerably by other authors. This theorem asserts that if a 
system is subject to a microcanonical distribution, then a small part of that 
system is necessarily subject to a canonical distribution. In particular, if we 
are dealing with a homogeneous system composed of N particles of identi- 
cal structure, then it follows from the microcanonical distribution of this 
system that the individual particles are distributed canonically. More
accurately, as $N \to \infty$, the probability that a particle will have the energy $r$
approaches the quantity


    $(V/N)\, g_r T_r = (V/N)\, g_r\, [e^{\alpha + \beta r} - \sigma]^{-1},$


which is asymptotically proportional to $g_r e^{-\beta r}$ for large $r$. (Since $V g_r$ states
of a particle correspond to the energy level $r$, it follows that the probability
of finding a particle in any one of these states is asymptotically
proportional to $T_r$.)
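
For scheme P this limiting statement can be compared with an exact microcanonical computation on a moderately large system. In scheme P the number of states $\Omega(N, E)$ is the coefficient of $x^E$ in $(\sum_k V g_k x^k)^N$, so the exact probability that a given particle has energy $r$ is $V g_r$ times the corresponding coefficient for $N - 1$ particles and energy $E - r$, divided by $\Omega(N, E)$. The Python sketch below is an illustration under stated assumptions: the three-level spectrum, the values of $N$ and $E$, the bisection bracket and all names are hypothetical.

```python
import math

def power_coefficients(weights, N, E):
    """Coefficients c[e], e = 0..E, of (sum_k weights[k] x^k)^N, truncated at degree E."""
    coeffs = [0] * (E + 1)
    coeffs[0] = 1
    for _ in range(N):
        new = [0] * (E + 1)
        for e, c in enumerate(coeffs):
            if c:
                for k, w in weights.items():
                    if e + k <= E:
                        new[e + k] += c * w
        coeffs = new
    return coeffs

def microcanonical_level_probabilities(weights, N, E):
    """Exact probability that one given particle has energy k (scheme P, k <= E assumed)."""
    rest = power_coefficients(weights, N - 1, E)     # the other N - 1 particles
    total = sum(w * rest[E - k] for k, w in weights.items())   # equals Omega(N, E)
    return {k: w * rest[E - k] / total for k, w in weights.items()}

def canonical_level_probabilities(weights, mean_energy):
    """Probabilities proportional to weights[k] * e^{-beta k}, beta fixed by the mean energy."""
    def mean(beta):
        z = sum(w * math.exp(-beta * k) for k, w in weights.items())
        return sum(k * w * math.exp(-beta * k) for k, w in weights.items()) / z
    lo, hi = -30.0, 30.0                             # assumed bracket for beta
    for _ in range(200):                             # mean decreases with beta
        mid = 0.5 * (lo + hi)
        if mean(mid) > mean_energy:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    z = sum(w * math.exp(-beta * k) for k, w in weights.items())
    return {k: w * math.exp(-beta * k) / z for k, w in weights.items()}

weights = {1: 1, 2: 1, 3: 1}                         # hypothetical values of V*g_r
N, E = 300, 450
exact = microcanonical_level_probabilities(weights, N, E)
limit = canonical_level_probabilities(weights, E / N)
```

In scheme P the limiting value $(V/N) g_r T_r$ equals $g_r e^{-\beta r} / \sum_k g_k e^{-\beta k}$, which is what `canonical_level_probabilities` computes; the two dictionaries should agree up to a correction that shrinks as $N$ grows.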

Thus, we no longer consider the original system to be isolated, but assume 
it to be in a state of free energetic contact with its very large surroundings 
(“heat bath”). We then take the “original system plus heat bath” as a 
new system, and consider the original system as a small part of this new 
system. We think of this new composite system as isolated, so that its energy 
is constant, and hence we assume that it is distributed microcanonically. 
The original system being just a small part will now be distributed canoni- 
cally in virtue of Gibbs’ theorem. 

With this approach to canonical averaging the appearance of the canoni- 
cal distribution does not represent an arbitrary assumption, but is a neces- 
sary consequence of the original hypothesis. The assumption that the total 
system is distributed microcanonically remains as the only hypothesis. We 
shall see that by using Theorem 1 (proved above) we can to a very great 
extent remove even this hypothetical element, and extend Gibbs’ theorem 
to the case of a system whose distribution is to a large degree arbitrary. 


§10 


A system of energy $E$ and volume $V$ consisting of $N$ particles of identical structure is distributed microcanonically if each of the $\Omega(N, E) = \Omega$ accessible fundamental states has the probability $1/\Omega$. Let us assume now that this probability can be different for different states, but that it is a symmetric function of the particle energies. (In this case antisymmetry would be in violent contradiction to our basic assumption about the equivalence of the particles.) In other words, this probability is a function of the numbers $N_1, N_2, \ldots, N_r, \ldots$, and can be conveniently represented in the form

$$F(N_1, N_2, \ldots, N_r, \ldots)/\Omega. \tag{19}$$

For the microcanonical distribution $F = 1$.

THEOREM 2. If the probabilities of the different states of the system, consistent with the given values of $N$, $V$ and $E$, are defined by (19), where the function $F(N_1, N_2, \ldots, N_r, \ldots)$ is bounded and satisfies the extended Lipschitz condition, then, as $N \to \infty$, the probability that an individual particle will have energy $r$ approaches

$$(V/N)\,g_r T_r = (V/N)\,g_r\,[e^{\alpha+\beta r} - \sigma]^{-1}.$$


Let us denote by $K$ the set of all choices of the numbers $N_r$ which satisfy the relations

$$\sum_r N_r = N; \qquad \sum_r r N_r = E, \tag{K}$$

and let us put

$$F(V g_1 T_1, V g_2 T_2, \ldots, V g_r T_r, \ldots) = \langle F \rangle.$$
Let K, denote that part of K for which 
(20) |F — <F>| < &V “%<F>, 


and let K; denote the remaining part. As we showed in the proof of Theorem 
1, the number of states corresponding to K: does not exceed Q¢’/V'“*. Let 
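us denote by $A = A(N_1, N_2, \ldots, N_r, \ldots)$ the number of different states of the system corresponding to a given choice of the numbers $N_r$. Then

$$\mathsf{E}N_r = \Omega^{-1} \sum_{K} N_r\, F(N_1, N_2, \ldots, N_r, \ldots)\, A(N_1, N_2, \ldots, N_r, \ldots),$$

where the summation extends over all choices of the numbers $N_r$ belonging to the set $K$. Whence,

$$\mathsf{E}N_r = \langle F \rangle\, \Omega^{-1} \sum_{K} N_r A + \Omega^{-1} \sum_{K_1} N_r [F - \langle F \rangle] A + \Omega^{-1} \sum_{K_2} N_r [F - \langle F \rangle] A. \tag{21}$$

Since, for any of these choices of the numbers $N_r$, we have

$$\sum_r N_r = N,$$

summing (21) over all values of $r$ yields

$$N = \langle F \rangle N + N \Omega^{-1} \sum_{K_1} [F - \langle F \rangle] A + N \Omega^{-1} \sum_{K_2} [F - \langle F \rangle] A. \tag{22}$$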


But in K, inequality (20) is satisfied; therefore, remembering the bounded- 
ness of the function F, we obtain 
N ey [E — <F>JAl <a<F>NV™ < e N, 


On the other hand, paren A < Q¢e'/V*™, and consequently, if we denote 
the upper bound of the function F by M, we have 


IND ep [F — <F>]A!l < NYMR t / V“) < og N“. 
Thus, relation (22) gives 


ING — <F>)!| < e N= + e N“ = o(N), 
and consequently $\langle F \rangle \to 1$ as $N \to \infty$. Since, on the other hand, the sum

$$\Omega^{-1} \sum_{K} N_r A$$


is obviously the microcanonical average E,A’, of the number N, , relation 
(21) gives 


|EN, — [1 + o(1)]EN, į < c N + e N", 

Using the well-known formula of quantum statistics 

EN, = Vg,1', + o(\), 
we find 

EN, = Vg,T, + o(N). 

But the probability that a particle has the energy $r$ is obviously equal to

$$P_N(r) = N^{-1}\, \mathsf{E}N_r.$$

Therefore, we find

$$P_N(r) = (V/N)\, g_r T_r + o(1),$$

and Theorem 2 is proved.


Supplement VI 


SYMMETRIC FUNCTIONS ON MULTI- 
DIMENSIONAL SURFACES [14] 


§1. Introduction 


The general properties of symmetric functions on multi-dimensional sur- 
faces are very important for the mathematical foundation of the principles 
of classical statistical physics. If the surface is a “surface of constant en- 
ergy” in the phase space of a physical system, then every point of the sur- 
face describes a definite state of the system, consistent with the given total 
energy. In classical physics, any physical quantity associated with a system 
is uniquely determined by the state of that system and, hence, for a fixed 
total energy, is a function of position on the corresponding ‘‘energy shell”. 

Obviously, one can consider that every macroscopic quantity, which is 
expressed as a function of the Hamiltonian coordinates of the particles of 
the system, will (in the case of a homogeneous system) always be sym- 
metric with respect to these particles. (From the mathematical point of 
view, it would perhaps be expedient to define a macroscopic quantity di- 
rectly as any function of the particle coordinates which is symmetric with 
respect to these particles.) Therefore, from the physical point of view, the 
general properties of such symmetric functions will express the general 
properties of macroscopic quantities for a system with a fixed total energy. 

Among these properties, the ‘‘near-constancy”’ of such functions has the 
most important physical implications: When the system consists of a large 
number of particles, such a function, as a rule, assumes values which are 
extremely close to one another for the great majority of points on the en- 
ergy shell. It is clear that all these values will be very close to the micro- 
canonical average of the function over the given energy shell. This fact is 
especially important because it makes the introduction of an “ergodic” 
theorem or hypothesis unnecessary for the study of such systems. In fact, 
if for the great majority of points on the energy shell the values of the func- 
tion in question are very near to its average over the shell, then it is im- 
mediately obvious that the time averages along the majority of the system 
trajectories will practically coincide with this average over the shell. However, establishing this nearness of “time” and “phase” averages is precisely the goal of any “ergodic” theory.

Of course, this ‘‘near-constancy” of macroscopic quantities over an en- 
ergy shell was well-known to the founders of statistical physics. One finds 
clear hints of this property in Jeans, Lorentz and many other authors. It 
is rather surprising that, up to the present, apparently no one has attempted 
either to prove this property of symmetric functions under some fairly general hypotheses or to give a clear definition of the corresponding mathematical problem. In Fowler’s Statistical Mechanics [11] this “near-constancy”
is established for the special class of quantities which I have called sum 
functions. In my two books on statistical mechanics, I gave a proof for this 
same class of quantities by a new method, namely by reducing the prob- 
lem to one involving the local limit theorems of the theory of probability. 
Later (see Supplement V), I used this method to prove the “near-con- 
stancy” for very general symmetric functions in the case of the simplest 
type of quantum systems. The corresponding problem in classical physics 
is much more complicated. A quite general solution is given for the first 
time in this paper. The only condition necessary for the ‘‘near-constancy”’ 
of a symmetric function over an energy shell is that the function be con- 
tinuous, where continuity is understood in a certain very broad sense. 

The basic result mentioned above is contained in §9. The intervening 
sections are concerned partly with preliminary results and partly with di- 
gressions, in which certain problems are solved which have no direct con- 
nection with the main development of the paper. Thus, in §§5 and 7, we 
obtain an accurate solution to the problem of finding the limiting distribu- 
tions of the different terms of a sequence obtained by arranging the ener- 
gies of the particles comprising the system in order of increasing magni- 
tude. This problem, no doubt, is of interest both in multi-dimensional 
geometry and in physics. 

Finally, in §10, our basic result is applied to the generalization of a the- 
orem of Gibbs. This generalization seems quite significant to me. The dis- 
cussion concerns the theorem that a small part of a system, under very 
broad hypotheses, is distributed canonically in its own phase space when 
the system as a whole is microcanonically distributed on an energy shell. 
This theorem is usually (and justly) considered the most important the- 
oretical argument in favor of the widely used canonical method of averaging. 
The only remaining hypothesis is the assumption about the microcanonical 
distribution of the “large” system. In §10 I show that this hypothesis can 
be changed to a much broader one: It is sufficient to assume that the dis- 
tribution of the system over the energy shell is symmetric with respect to 
the particles (and that certain natural and general requirements of bounded- 
ness and continuity are satisfied). In fact, according to the results of §9, 
any continuous symmetric distribution of the system over the energy shell 
is “almost equal” to a microcanonical distribution. 


§2. Preliminary formulas 


In this section we collect several well-known formulas of classical statis- 
tical mechanics which will be used later. 
Let us consider mechanical systems composed of a very large number $N$ of particles of identical structure. We shall denote the energy of the system by $E$ and the energy of the $i$th particle by $e_i$. We shall assume that

$$E = \sum_{i=1}^{N} e_i.$$


The set of Hamiltonian variables of the $i$th particle will be called briefly $(q_i, p_i)$. Any physical quantity associated with the $i$th particle is a function of these variables. In particular, the energy $e_i$ equals $e(q_i, p_i)$, where the form of the function $e(q_i, p_i)$ does not depend on $i$. (This just expresses the fact that the particles have identical structure.) The Euclidean space whose Cartesian coordinates are the Hamiltonian variables of a particular particle is called the phase space of that particle. We shall denote by $v(x)$ the volume of that part of this phase space in which $e(q_i, p_i) < x$. We assume that the zero value of the energy is chosen so that $v(0) = 0$.

The function $\omega(x) = v'(x)$, which plays a most important role in our work, will be called the structure function of the particle. The hypersurface $e(q_i, p_i) = x$ of the phase space of the $i$th particle will be called the “surface of constant energy $x$”. Denoting by $d\sigma$ the “surface element” of this surface, we have the well-known result

$$\omega(x) = \int d\sigma / \operatorname{grad} e(q_i, p_i),$$

where the integration is carried out over the entire surface $e(q_i, p_i) = x$.

We introduce analogous concepts for the system composed of $N$ particles. The Cartesian coordinates of the phase space of this system are the Hamiltonian variables $(q_i, p_i)$ of all the particles ($i = 1, \ldots, N$). The volume of that part of this space in which

$$\sum_{i=1}^{N} e(q_i, p_i) < x$$

is denoted by $V_N(x)$, and the function

$$\Omega_N(x) = V_N'(x)$$

is called the structure function of the system. If $d\Sigma_N$ is the “surface element” of the energy shell

$$\sum_{i=1}^{N} e(q_i, p_i) = x, \tag{1}$$

then

$$\Omega_N(x) = \int d\Sigma_N / \operatorname{grad} E,$$

where the integration is carried out over the entire shell (1) and where

$$E = \sum_{i=1}^{N} e(q_i, p_i)$$

is the total energy of the system.

The functions $V_N(x)$ and $\Omega_N(x)$ for the system are related to the corresponding functions for one particle by simple composition rules. The relation

$$V_N(x) = \int^{(N)}_{x_1 + \cdots + x_N < x} \prod_{i=1}^{N} \{\omega(x_i)\, dx_i\} = \int^{(N-1)} \prod_{i=1}^{N-1} \{\omega(x_i)\, dx_i\}\; v\Bigl(x - \sum_{i=1}^{N-1} x_i\Bigr)$$

is almost obvious, and a differentiation with respect to $x$ leads to the composition rule for the structure functions

$$\Omega_N(x) = \int^{(N-1)} \prod_{i=1}^{N-1} \{\omega(x_i)\, dx_i\}\; \omega\Bigl(x - \sum_{i=1}^{N-1} x_i\Bigr), \tag{2}$$

where the integral is extended over the entire $(N-1)$-dimensional space.

Formally, this rule is completely analogous to the composition rule of 
probability densities for the sum of mutually independent random variables. 
This circumstance makes it possible to estimate the quantity Qy(x) by using 
the local limit theorems of the theory of probability, as we shall see in de- 
tail below. 
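The composition rule can be tried out numerically. The following is only a minimal sketch under assumptions that are not in the text: the structure function of a particle is taken to be $\omega(x) = 1$ for $x \ge 0$ (so that the exact answer is $\Omega_N(x) = x^{N-1}/(N-1)!$), the functions are sampled on a uniform grid, and each composition is approximated by the trapezoid rule. The names `compose` and `structure_function` are hypothetical.

```python
def compose(structure, omega, h):
    """Apply the composition rule once on the grid x = 0, h, 2h, ...

    Approximates the integral of structure(t) * omega(x - t) over 0 <= t <= x
    by the trapezoid rule; both functions are taken to vanish for negative
    arguments.
    """
    n = len(structure)
    out = [0.0] * n
    for i in range(1, n):
        s = sum(structure[j] * omega[i - j] for j in range(i + 1))
        s -= 0.5 * (structure[0] * omega[i] + structure[i] * omega[0])
        out[i] = s * h
    return out


def structure_function(omega, h, n_particles):
    """Grid values of Omega_N obtained from omega by N - 1 compositions."""
    result = list(omega)
    for _ in range(n_particles - 1):
        result = compose(result, omega, h)
    return result


h = 0.01
omega = [1.0] * 101                      # assumed omega(x) = 1 for 0 <= x <= 1
omega_3 = structure_function(omega, h, 3)
omega_4 = structure_function(omega, h, 4)
print(omega_3[100], 1 / 2)               # Omega_3(1) against x**2 / 2
print(omega_4[100], 1 / 6)               # Omega_4(1) against x**3 / 6
```

The repeated call to `compose` is exactly the repeated convolution by which the density of a sum of independent random variables is built up, which is the analogy the text draws.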

The physical state of a system is determined by the location of a point 
in its phase space. Every physical quantity which has a definite value in 
each state of the system is therefore some function f(P) of the point P in 
its phase space. The quantity $\langle f \rangle$, defined by

$$\langle f \rangle = [\Omega_N(x)]^{-1} \int f(P)\, d\Sigma_N / \operatorname{grad} E,$$

will be called the mean value of $f(P)$ for a system with total energy $x$. The integration is extended over the constant energy surface $E = x$. This is usually called the microcanonical principle of averaging. In particular, this allows us to define the probabilities of various relations among the Hamiltonian variables of the system. In fact, for any such relation $A$ we can put $f(P) = 1$ if relation $A$ is satisfied at the point $P$, and $f(P) = 0$ otherwise. Then the probability of the event $A$ (for $E = x$) is defined as the microcanonical average of $f(P)$ over the surface $E = x$.

The following relation ([1; p. 37]) is often useful for computing microcanonical averages:

$$\langle f \rangle = [\Omega_N(x)]^{-1}\, \frac{d}{dx} \int_{E < x} f(P)\, dV, \tag{3}$$

where $dV$ stands for the volume element of the phase space of the system and the integral extends over that part of this space in which $E < x$. Let us consider, as an example, the important case in which $f(P)$ depends only on the particle energies:

$$f(P) = \Phi(e_1, \ldots, e_N).$$


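Then we have

$$\int_{E<x} f(P)\, dV = \int_{e_1 + \cdots + e_N < x} \Phi(e_1, \ldots, e_N)\, dV = \int^{(N)}_{x_1 + \cdots + x_N < x} \Phi(x_1, \ldots, x_N) \prod_{i=1}^{N} \{\omega(x_i)\, dx_i\}$$

$$= \int^{(N-1)} \prod_{i=1}^{N-1} \{\omega(x_i)\, dx_i\} \int_0^{x - (x_1 + \cdots + x_{N-1})} \Phi(x_1, \ldots, x_N)\, \omega(x_N)\, dx_N.$$

By (3), we then get

$$\langle f \rangle = [\Omega_N(x)]^{-1} \int^{(N-1)} \Phi\Bigl(x_1, \ldots, x_{N-1},\; x - \sum_{i=1}^{N-1} x_i\Bigr) \prod_{i=1}^{N-1} \{\omega(x_i)\, dx_i\}\; \omega\Bigl(x - \sum_{i=1}^{N-1} x_i\Bigr). \tag{4}$$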

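This formula will be quite useful to us later. In the special case for which $f(P) = \varphi(e_1)$ depends on the energy of only one (say the first) particle, (4) gives

$$\langle f \rangle = \langle \varphi(e_1) \rangle = [\Omega_N(x)]^{-1} \int \varphi(x_1)\, \omega(x_1)\, dx_1 \int^{(N-2)} \prod_{i=2}^{N-1} \{\omega(x_i)\, dx_i\}\; \omega\Bigl(x - \sum_{i=1}^{N-1} x_i\Bigr).$$

According to (2), the inner integral on the right side equals $\Omega_{N-1}(x - x_1)$, and we find

$$\langle \varphi(e_1) \rangle = [\Omega_N(x)]^{-1} \int \varphi(x_1)\, \omega(x_1)\, \Omega_{N-1}(x - x_1)\, dx_1. \tag{5}$$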


Another important application of formula (3) is to express the distribution of an individual particle in its phase space. For definiteness, let us speak of the first particle. Let us denote its phase space by $\Gamma_1$ and let us determine the probability that the Hamiltonian variables $(q_1, p_1)$ of this particle belong to a certain region $A$ of the space $\Gamma_1$. This probability will be the microcanonical average of the function $f(P)$ (where $P$, as always, denotes a point of the surface $E = x$) defined in the following manner: $f(P) = 1$ if the point $P$ is such that $(q_1, p_1) \in A$, and $f(P) = 0$ otherwise. Clearly, since $f(P)$ depends only on the Hamiltonian variables $(q_1, p_1)$ of the first particle,

$$\int_{E<x} f(P)\, dV = \int_{\Gamma_1} f(P)\, dv_1 \int_{E^* < x - e_1} dV^*.$$

Here $E^*$ and $dV^*$ denote, respectively, the energy and volume element of the phase space for the rest of the system (i.e., the set of all particles other than the first), and $e_1$ is the energy of the first particle at the given point of the space $\Gamma_1$. Obviously,

$$\int_{E^* < x - e_1} dV^* = V_{N-1}(x - e_1),$$

so that

$$\int_{E<x} f(P)\, dV = \int_{\Gamma_1} f(P)\, V_{N-1}(x - e_1)\, dv_1.$$

Hence,

$$\frac{d}{dx} \int_{E<x} f(P)\, dV = \int_{\Gamma_1} f(P)\, \Omega_{N-1}(x - e_1)\, dv_1 = \int_A \Omega_{N-1}(x - e_1)\, dv_1,$$

and consequently,

$$\langle f \rangle = P\{(q_1, p_1) \in A\} = \int_A [\Omega_{N-1}(x - e_1)/\Omega_N(x)]\, dv_1.$$

This means that if $x$ is the total energy of the system, the particle we selected is distributed in its phase space $\Gamma_1$ with density

$$\rho(q_1, p_1) = \Omega_{N-1}(x - e_1)/\Omega_N(x), \tag{6}$$

where $e_1 = e(q_1, p_1)$ is the energy of the particle at the given point. In particular, we see that this distribution depends only on the energy $e(q_1, p_1)$ of the particle, so that for all points of the space $\Gamma_1$ with the same value of the energy, the probability density will be the same. It is easy to show that (5) follows from (6) in a straightforward way.


§3. Distribution of the energy of a particle. Gibbs’ theorem 


First, we consider the problem of the energy distribution of an individual particle. As in §2 we define the probability $P(e_1 < u)$ that the energy of the first particle is less than the positive number $u$ to be the microcanonical average of $f(P)$, where $f(P)$ is equal to 1 if, at the point $P$ of the surface $e_1 + \cdots + e_N = E$, we have $e_1 = e(q_1, p_1) < u$, and is equal to zero otherwise. Since $f(P)$ depends only on $e_1$ we can put

$$f(P) = \varphi_u(e_1) = \begin{cases} 1 & (e_1 < u) \\ 0 & (e_1 \ge u) \end{cases}$$

and, according to (5), we have

$$P(e_1 < u) = \langle \varphi_u(e_1) \rangle = [\Omega_N(E)]^{-1} \int \varphi_u(z)\, \omega(z)\, \Omega_{N-1}(E - z)\, dz = \int_0^u [\omega(z)\, \Omega_{N-1}(E - z)/\Omega_N(E)]\, dz, \tag{7}$$

where $E$ denotes the total energy of the system. We see that for an asymptotic estimate of the quantity $P(e_1 < u)$ we must find approximate expressions for the quantities $\Omega_N(E)$ and $\Omega_{N-1}(E - z)$.

For this purpose let us use the method described in my book [1]. We assume that the number of particles $N$ and the total energy $E$ of the system are infinitely large, maintaining the constant ratio $E/N = a$. If we assume further that $\omega(z)$ satisfies the usual conditions ([1; p. 76]), then a unique positive number $\alpha$ exists for which

$$\int_0^\infty z\, \omega(z) e^{-\alpha z}\, dz \Big/ \int_0^\infty \omega(z) e^{-\alpha z}\, dz = a. \tag{8}$$
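Equation (8) can be solved numerically once a structure function is chosen. The sketch below is an illustration only and rests on assumptions not made in the text: the structure function is taken to be $\omega(z) = z^{1/2}$ (for which the root is known in closed form, $\alpha = 3/(2a)$), the integrals are truncated at a finite upper limit and computed by the midpoint rule, and the root is found by bisection, using the fact that the left side of (8) decreases as $\alpha$ grows. The names `mean_energy` and `solve_alpha` are hypothetical.

```python
from math import exp


def mean_energy(alpha, omega, z_max=200.0, steps=20000):
    """Left side of equation (8), by the midpoint rule on [0, z_max]."""
    h = z_max / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        w = omega(z) * exp(-alpha * z)
        num += z * w
        den += w
    return num / den


def solve_alpha(a, omega, lo=0.05, hi=50.0, tol=1e-10):
    """Bisection for the root of (8); the mean energy decreases in alpha."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, omega) > a:
            lo = mid      # mean too large: alpha must be increased
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Assumed structure function omega(z) = sqrt(z) and mean energy a = 1.5.
alpha = solve_alpha(1.5, lambda z: z ** 0.5)
print(alpha)   # to be compared with the closed-form root 3 / (2 * a)
```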


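Let us put

$$\int_0^\infty \omega(z) e^{-\alpha z}\, dz = \lambda, \qquad \lambda^{-1} \omega(z) e^{-\alpha z} = \varphi(z).$$

The function $\varphi(z)$ is the density of a certain distribution whose mathematical expectation, according to (8), is equal to $a$. Since $\omega(z) = \lambda e^{\alpha z} \varphi(z)$, relation (2) gives

$$\Omega_N(x) = \lambda^N e^{\alpha x}\, \Phi_N(x),$$

where

$$\Phi_N(x) = \int^{(N-1)} \prod_{i=1}^{N-1} \{\varphi(x_i)\, dx_i\}\; \varphi\Bigl(x - \sum_{i=1}^{N-1} x_i\Bigr). \tag{9}$$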


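Therefore, relation (7) can be rewritten in the form

$$P(e_1 < u) = \int_0^u [\lambda e^{\alpha z} \varphi(z)\, \lambda^{N-1} e^{\alpha(E - z)}\, \Phi_{N-1}(E - z) / \lambda^N e^{\alpha E}\, \Phi_N(E)]\, dz = \int_0^u [\varphi(z)\, \Phi_{N-1}(E - z)/\Phi_N(E)]\, dz. \tag{10}$$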
But $\Phi_N(x)$, by its definition (9), is the density at the point $x$ of the distribution of the sum of $N$ mutually independent terms with density $\varphi(z)$. According to the local limit theorem ([8; p. 228]), all of whose requirements are satisfied in our case, we have

$$\Phi_N[Na + z(Nb)^{1/2}] = \Phi_N[E + z(Nb)^{1/2}] = (2\pi Nb)^{-1/2} e^{-z^2/2} + N^{-1/2} \nu_N(z), \tag{11}$$

where $b$ denotes the dispersion of the distribution $\varphi(z)$, and where as $N \to \infty$, $\nu_N(z) \to 0$ uniformly on the entire real line ($-\infty < z < \infty$). Thus,

$$\Phi_N(E) = (2\pi Nb)^{-1/2} + o(N^{-1/2}),$$

and for $z = O(1)$

$$\Phi_{N-1}(E - z) = (2\pi Nb)^{-1/2} + o(N^{-1/2}).$$

Therefore, for constant $u$

$$\Phi_{N-1}(E - z)/\Phi_N(E) = 1 + o(1) \tag{12}$$

uniformly in the interval $0 \le z \le u$. From (10) we obtain as $N \to \infty$

$$P(e_1 < u) = \int_0^u \varphi(z)\, dz + o(1),$$

and consequently,

$$P(e_1 < u) \to \int_0^u \varphi(z)\, dz = \int_0^u \omega(z) e^{-\alpha z}\, dz \Big/ \int_0^\infty \omega(z) e^{-\alpha z}\, dz \qquad (N \to \infty).$$


Thus, the problem of determining the limit distribution of the energy of a particle is completely solved. Let us recall further that the number $\alpha$ is the unique root of equation (8). In the sequel we put

$$\int_0^u \omega(z) e^{-\alpha z}\, dz \Big/ \int_0^\infty \omega(z) e^{-\alpha z}\, dz = K(u),$$

so that for any $u > 0$

$$P(e_1 < u) \to K(u) \qquad (N \to \infty).$$
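The convergence just established can be watched in a case where everything is available in closed form. The sketch below assumes (this is not in the text) the structure function $\omega(z) = 1$ for $z \ge 0$; then $\Omega_N(x) = x^{N-1}/(N-1)!$, formula (7) integrates to $1 - (1 - u/E)^{N-1}$, equation (8) gives $\alpha = 1/a$, the dispersion is $b = a^2$, and $K(u) = 1 - e^{-u/a}$. The function names are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def exact_prob(n, a, u):
    """P(e_1 < u) from (7) when omega(z) = 1, with total energy E = n * a."""
    e_total = n * a
    if u >= e_total:
        return 1.0
    return 1.0 - (1.0 - u / e_total) ** (n - 1)


def limit_prob(a, u):
    """K(u) for omega(z) = 1: alpha = 1/a and K(u) = 1 - exp(-u/a)."""
    return 1.0 - exp(-u / a)


def local_limit_ratio(n, a):
    """Phi_N(E) * sqrt(2*pi*N*b) for phi(z) = alpha * exp(-alpha*z), b = a**2.

    Phi_N is then the gamma density alpha**N * x**(N-1) * exp(-alpha*x) / (N-1)!,
    evaluated here at x = E = n * a through logarithms.
    """
    alpha = 1.0 / a
    e_total = n * a
    log_phi_n = (n * log(alpha) + (n - 1) * log(e_total)
                 - alpha * e_total - lgamma(n))
    return exp(log_phi_n) * sqrt(2.0 * pi * n * a * a)


for n in (10, 1000, 100000):
    print(n, exact_prob(n, 2.0, 3.0), limit_prob(2.0, 3.0), local_limit_ratio(n, 2.0))
```

The last column illustrates the estimate of $\Phi_N(E)$ obtained from (11): the ratio should approach unity as $N$ grows.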


The corresponding local limit theorem is also an immediate consequence of our calculations. Differentiating (10) and using the estimate (12) we find

$$dP(e_1 < u)/du = \varphi(u)\, \Phi_{N-1}(E - u)/\Phi_N(E) = \varphi(u) + o(1).$$

Whence, as $N \to \infty$

$$dP(e_1 < u)/du \to \varphi(u) = K'(u) \qquad (0 < u < \infty).$$
Further, according to (11), we obviously have $\Phi_{N-1}(x) = O(N^{-1/2})$ uniformly for $-\infty < x < \infty$. According to the asymptotic estimate we have obtained for $\Phi_N(E)$, there exists a constant $c > 0$ such that

$$\Phi_{N-1}(E - z)/\Phi_N(E) < c$$

uniformly with respect to all values of $N$ and $z$. Since (10) shows that

$$P(e_1 \ge u) = \int_u^\infty [\varphi(z)\, \Phi_{N-1}(E - z)/\Phi_N(E)]\, dz,$$

we find for all $N$ and for all $u > 0$

$$P(e_1 \ge u) < c \int_u^\infty \varphi(z)\, dz = c[1 - K(u)]. \tag{13}$$


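Let us assume now that the function $\omega(z) e^{-\gamma z}$ is decreasing (at least for sufficiently large $z$) for any $\gamma > 0$. (This is always the case in real physical problems.) Then, for sufficiently large $u$ and with $0 < \beta < \alpha$

$$1 - K(u) = \lambda^{-1} \int_u^\infty e^{-\alpha z} \omega(z)\, dz = \lambda^{-1} \int_u^\infty [\omega(z) e^{-(\alpha - \beta) z}]\, e^{-\beta z}\, dz$$

$$< \lambda^{-1} \omega(u) e^{-(\alpha - \beta) u} \int_u^\infty e^{-\beta z}\, dz = (\beta\lambda)^{-1} \omega(u) e^{-\alpha u} < c^{-1} e^{-\beta u},$$

and, consequently, for any $N$ and sufficiently large $u$

$$P(e_1 \ge u) < e^{-\beta u}.$$

Thus we arrive at the following general estimate which we will need later:

Lemma. Let $0 < \beta < \alpha$. Then for any $N$ and sufficiently large $u$

$$P(e_1 \ge u) < e^{-\beta u}.$$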
Formula (6), §2, for the density of the distribution of a particle in its phase space can also be approximated simply and conveniently with the help of these asymptotic formulas. First, we easily find

$$\Omega_{N-1}(E - e_1)/\Omega_N(E) = \lambda^{-1} e^{-\alpha e_1}\, \Phi_{N-1}(E - e_1)/\Phi_N(E).$$

Since $\Phi_{N-1}(E - e_1)/\Phi_N(E)$ tends to unity for constant $e_1$ and $N \to \infty$, then as $N \to \infty$,

$$\lim\, [\Omega_{N-1}(E - e_1)/\Omega_N(E)] = \lambda^{-1} e^{-\alpha e_1}.$$

Formula (6) shows therefore that when $N$ and $E$ approach infinity while maintaining a constant ratio, the density of the distribution of a particle in its phase space also tends [at the point $(q_1, p_1)$ of this space] to the quantity

$$\lambda^{-1} e^{-\alpha e_1} = e^{-\alpha e_1} \Big/ \int_0^\infty \omega(z) e^{-\alpha z}\, dz,$$

where $e_1 = e(q_1, p_1)$ is the energy of the particle at the point $(q_1, p_1)$. The probability that, for example, the first particle will belong to a certain domain $A$ of its phase space will thus tend to the quantity

$$\int_A e^{-\alpha e(q_1, p_1)}\, dv_1 \Big/ \int_{\Gamma_1} e^{-\alpha e(q_1, p_1)}\, dv_1,$$

where $dv_1$ is the volume element of the space $\Gamma_1$. This result, which is usually called Gibbs’ Theorem, can be interpreted as follows: If a large system is microcanonically distributed in its phase space, a small part of this system (a particle) is canonically distributed in its phase space.


§4. Derivation of the fundamental formula 


We turn now to the solution of a more difficult problem. Let an arbitrary $u \ge 0$ be given. Then, in general, for an arbitrary point $P$ of the surface

$$\sum_{i=1}^{N} e(q_i, p_i) = E,$$

some particles will have energy less than $u$, and the others will have energy greater than or equal to $u$.

Let us denote by $\psi_N(u)$ the relative number of those particles whose energies are less than $u$, so that $\psi_N(u)$ is one of the numbers

$$0,\ 1/N,\ 2/N,\ \ldots,\ (N-1)/N,\ 1.$$

Obviously, $\psi_N(u)$, for given $N$ and $u$, is a function of the point $P$ of the constant energy surface and consequently, from our point of view, must be considered as a random variable. We shall now find the distribution of this random variable. The solution of this problem is the key to the study of symmetric functions of the particle energies. In fact, at two points for which the values of $\psi_N(u)$ coincide for all $u > 0$, the energies of the particles obviously can differ from one another only in so far as the particles are enumerated in a different order. Conversely, this means that any symmetric function of the particle energies assumes identical values at two such points. In other words, any symmetric function of the particle energies can be considered as a functional whose argument is the function $\psi_N(u)$.

Such a functional (if it is continuous in a certain definite sense) will be “almost constant” on a surface of constant energy, if the function $\psi_N(u)$ possesses this property. (This point is discussed in greater detail in §9.) It is for precisely this reason that we must study the distribution law of the quantity $\psi_N(u)$.
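The statement that a symmetric function of the particle energies depends on the state only through the relative number of particles with energy less than $u$ can be illustrated on a small example. The sketch below is purely illustrative; the energies, the function names `psi` and `symmetric_sum`, and the choice of the sum of squares as the symmetric function are all assumptions made for the example.

```python
def psi(energies, u):
    """Relative number of particles whose energy is less than u."""
    return sum(1 for e in energies if e < u) / len(energies)


def symmetric_sum(energies):
    """An example of a symmetric function: the sum of the squared energies."""
    return sum(e * e for e in energies)


state = [0.25, 1.5, 0.75, 2.5]
relabeled = [2.5, 0.25, 1.5, 0.75]   # same energies, particles enumerated differently

thresholds = [0.1, 0.5, 1.0, 2.0, 3.0]
print([psi(state, u) for u in thresholds])
print([psi(relabeled, u) for u in thresholds])
print(symmetric_sum(state), symmetric_sum(relabeled))
```

Both states give the same step function of $u$, and the symmetric function takes the same value on both.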

The event $\psi_N(u) = k/N$ ($0 \le k \le N$), whose probability we set out to determine, is the event in which $k$ of the $N$ particles of the system have energy less than $u$, while the remainder have energy greater than or equal to $u$.

Since $k$ particles can be chosen in $C(N, k)$ different ways, and since all the particles are indistinguishable from one another,

$$P\{\psi_N(u) = k/N\} = C(N, k)\, P\{e_i < u\ (1 \le i \le k);\ e_i \ge u\ (k+1 \le i \le N)\} = C(N, k)\, P\{A_k(u)\},$$

where, for brevity, we have used the symbol $A_k(u)$ to denote the event $\{e_i < u\ (1 \le i \le k);\ e_i \ge u\ (k+1 \le i \le N)\}$. If we now put, as in §3,

$$\varphi_u(z) = \begin{cases} 1 & (0 \le z < u), \\ 0 & (z < 0,\ z \ge u), \end{cases} \tag{14}$$

then the probability of the event $A_k(u)$ is by definition the microcanonical average of the quantity

$$F_k(u) = \prod_{i=1}^{k} \varphi_u(e_i) \prod_{i=k+1}^{N} [1 - \varphi_u(e_i)].$$


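Therefore, if the total energy of the system is $E$, the application of (4), §2 yields

$$P\{A_k(u)\} = \langle F_k(u) \rangle = [\Omega_N(E)]^{-1} \int^{(N-1)} \prod_{i=1}^{k} \{\varphi_u(x_i)\, \omega(x_i)\, dx_i\} \prod_{i=k+1}^{N-1} \{[1 - \varphi_u(x_i)]\, \omega(x_i)\, dx_i\} \Bigl[1 - \varphi_u\Bigl(E - \sum_{i=1}^{N-1} x_i\Bigr)\Bigr]\, \omega\Bigl(E - \sum_{i=1}^{N-1} x_i\Bigr). \tag{15}$$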


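This gives us a very inconvenient expression for the probability $P\{\psi_N(u) = k/N\}$.

Hence, we shall make a preliminary transformation completely analogous to (although somewhat more complicated than) the one carried out in §3. Let us put

$$\lambda = \int_0^\infty \omega(z) e^{-\alpha z}\, dz,$$

where $\alpha$ is determined from the equation

$$\int_0^\infty z\, \omega(z) e^{-\alpha z}\, dz \Big/ \int_0^\infty \omega(z) e^{-\alpha z}\, dz = E/N = a.$$

Let us also put

$$K(u) = \lambda^{-1} \int_0^u \omega(z) e^{-\alpha z}\, dz,$$

$$\varphi_u(z)\, \omega(z) e^{-\alpha z} \Big/ \int_0^\infty \varphi_u(z)\, \omega(z) e^{-\alpha z}\, dz = [\lambda K(u)]^{-1} \varphi_u(z)\, \omega(z) e^{-\alpha z} = f_1(z),$$

and

$$[1 - \varphi_u(z)]\, \omega(z) e^{-\alpha z} \Big/ \int_0^\infty [1 - \varphi_u(z)]\, \omega(z) e^{-\alpha z}\, dz = \lambda^{-1} [1 - K(u)]^{-1} [1 - \varphi_u(z)]\, \omega(z) e^{-\alpha z} = f_2(z),$$

so that $f_1(z)$ and $f_2(z)$ are the densities of certain distributions. Thus, we have

$$\varphi_u(z)\, \omega(z) = \lambda K(u)\, e^{\alpha z} f_1(z),$$

$$[1 - \varphi_u(z)]\, \omega(z) = \lambda [1 - K(u)]\, e^{\alpha z} f_2(z),$$

and (15) takes the form

$$P\{A_k(u)\} = [\Omega_N(E)]^{-1} \lambda^N [K(u)]^k [1 - K(u)]^{N-k} e^{\alpha E} \int^{(N-1)} \prod_{i=1}^{k} \{f_1(x_i)\, dx_i\} \prod_{i=k+1}^{N-1} \{f_2(x_i)\, dx_i\}\; f_2\Bigl(E - \sum_{i=1}^{N-1} x_i\Bigr). \tag{16}$$

Here the $(N-1)$-fold integral on the right side obviously represents the density, at the point $E$, of the distribution of the sum of $N$ mutually independent random variables, of which the first $k$ are distributed with density $f_1(z)$ and the remaining $N - k$ are distributed with density $f_2(z)$. For brevity, let us denote this density by $\Pi_k(u, E)$. Then (14) and (16) give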

$$P\{\psi_N(u) = k/N\} = C(N, k)\, \lambda^N e^{\alpha E} [\Omega_N(E)]^{-1} [K(u)]^k [1 - K(u)]^{N-k}\, \Pi_k(u, E). \tag{17}$$

Finally, let us note that $\Pi_0(0, E)$ represents the density at the point $E$ of the distribution of the sum of $N$ mutually independent random variables distributed with density $\lambda^{-1} \omega(z) e^{-\alpha z}$. In §3 we denoted this quantity by $\Phi_N(E)$ and we saw that

$$\Omega_N(E) = \lambda^N e^{\alpha E}\, \Phi_N(E) = \lambda^N e^{\alpha E}\, \Pi_0(0, E).$$

Therefore, we find

$$P\{\psi_N(u) = k/N\} = C(N, k)\, [K(u)]^k [1 - K(u)]^{N-k}\, \Pi_k(u, E)/\Pi_0(0, E). \tag{18}$$

Formula (18) is basic to all our further calculations. We shall see that it is exceptionally convenient for this purpose. The study of (18) involves finding the asymptotic estimate of the density $\Pi_k(u, E)$. This computation naturally involves the distributions $f_1(z)$ and $f_2(z)$. Therefore we shall describe here some of the simplest properties of these distributions which will be needed later.

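Let $M_r^{(1)}$ and $M_r^{(2)}$ denote, respectively, the $r$th moments of the distributions $f_1(z)$ and $f_2(z)$. Then

$$K(u)\, M_r^{(1)} = K(u) \int_0^\infty z^r f_1(z)\, dz = \lambda^{-1} \int_0^\infty \varphi_u(z)\, z^r \omega(z) e^{-\alpha z}\, dz = \lambda^{-1} \int_0^u z^r \omega(z) e^{-\alpha z}\, dz,$$

and analogously,

$$[1 - K(u)]\, M_r^{(2)} = \lambda^{-1} \int_u^\infty z^r \omega(z) e^{-\alpha z}\, dz.$$

Whence,

$$K(u)\, M_r^{(1)} + [1 - K(u)]\, M_r^{(2)} = \lambda^{-1} \int_0^\infty z^r \omega(z) e^{-\alpha z}\, dz.$$

Thus, a linear combination of the $r$th moments of the distributions $f_1$ and $f_2$ with coefficients $K(u)$ and $1 - K(u)$ is always equal (independently of $u$) to the moment of the same order of the distribution

$$\lambda^{-1} \omega(z) e^{-\alpha z} = \varphi(z)$$

which we considered in §3. By the definition of the number $\alpha$, the mathematical expectation of this law is equal to $a = E/N$. Its dispersion will be denoted by $b$. Thus,

$$K(u)\, M_1^{(1)} + [1 - K(u)]\, M_1^{(2)} = a, \tag{19}$$

and

$$K(u)\, M_2^{(1)} + [1 - K(u)]\, M_2^{(2)} = b + a^2. \tag{20}$$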
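Relations (19) and (20) say that the weighted combination of moments does not depend on the cut point $u$. A minimal numerical sketch follows; it assumes (this is not in the text) the structure function $\omega(z) = 1$ with $\alpha = 1$, so that $a = 1$, $b = 1$ and $b + a^2 = 2$, truncates the infinite integrals at a finite upper limit, and uses the midpoint rule. The names `partial_moment` and `moment_combination` are hypothetical.

```python
from math import exp


def partial_moment(r, lo, hi, alpha, omega, steps=20000):
    """Midpoint-rule integral of z**r * omega(z) * exp(-alpha*z) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        total += z ** r * omega(z) * exp(-alpha * z)
    return total * h


def moment_combination(r, u, alpha, omega, z_max=60.0):
    """K(u) * M_r^(1) + (1 - K(u)) * M_r^(2), built from the definitions."""
    lam = partial_moment(0, 0.0, z_max, alpha, omega)
    k_u = partial_moment(0, 0.0, u, alpha, omega) / lam
    m1 = partial_moment(r, 0.0, u, alpha, omega) / (lam * k_u)            # r-th moment of f_1
    m2 = partial_moment(r, u, z_max, alpha, omega) / (lam * (1.0 - k_u))  # r-th moment of f_2
    return k_u * m1 + (1.0 - k_u) * m2


def flat(z):
    """Assumed structure function omega(z) = 1."""
    return 1.0


for u in (0.7, 3.0):
    print(u, moment_combination(1, u, 1.0, flat), moment_combination(2, u, 1.0, flat))
```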


§5. Distribution of the maximum and minimum energy of a particle 


Before we proceed to the general asymptotic analysis of the basic formula (18), let us consider some of its more elementary applications. We shall show that with the help of this formula it is easy to find a very accurate expression for the distribution of the maximum and minimum energy of a particle (when the total energy of the system is fixed). [Here, and in the sequel, by maximum (minimum) energy of a particle is meant that energy which is the maximum (minimum) possessed by any single particle for a given state of the entire system. Since the division of the total energy among the separate particles depends on the state of the system, so does the maximum (minimum) energy of a particle.]

The requirement that the maximum energy of a particle be less than $u$ is equivalent to the requirement that all $N$ particles have energy less than $u$. This implies that $\psi_N(u) = 1$. Thus the distribution of the maximum energy of a particle is

$$G(u) = P\{\psi_N(u) = 1\},$$

which can be found from (18) by putting $k = N$. Hence,

$$G(u) = \{K(u)\}^N\, \Pi_N(u, E)/\Pi_0(0, E). \tag{21}$$

We turn now to a detailed calculation whose result will express with great accuracy the limiting form of the distribution $G(u)$:

THEOREM. For constant $z$ ($-\infty < z < \infty$), as $N \to \infty$,

$$G(\alpha^{-1} \ln N\, \{1 + \ln \omega(\alpha^{-1} \ln N)/\ln N + z/\ln N\}) \to \exp(-(\lambda\alpha)^{-1} e^{-z}).$$

In particular, this theorem shows that for sufficiently large $N$ the maximum energy of a particle, with a probability as near as we like to unity, will be approximately

$$\alpha^{-1} \ln N + \alpha^{-1} \ln \omega(\alpha^{-1} \ln N).$$
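The limit law can be compared with an exact finite-$N$ computation in one special case. The sketch below assumes (this is not in the text) the structure function $\omega(z) = 1$; the particle energies are then uniformly distributed over the simplex $e_1 + \cdots + e_N = E$, and the distribution of the largest of them is given by the classical inclusion-exclusion formula quoted in the docstring, which is a swapped-in tool and not the method of the proof. In this case $\alpha = 1/a$, $\lambda\alpha = 1$, the argument of $G$ reduces to $a(\ln N + z)$, and the limit is $\exp(-e^{-z})$. The function names are hypothetical.

```python
from math import exp, lgamma, log


def max_energy_cdf(n, a, u):
    """P(max_i e_i < u) for energies uniform on the simplex e_1 + ... + e_N = n*a.

    Inclusion-exclusion: sum over j with j*u < E of
    (-1)**j * C(N, j) * (1 - j*u/E)**(N-1); terms are formed through logarithms
    so that the binomial coefficients cannot overflow.
    """
    e_total = n * a
    total = 0.0
    j = 0
    while j <= n and j * u < e_total:
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + (n - 1) * log(1.0 - j * u / e_total))
        total += (-1) ** j * exp(log_term)
        j += 1
    return total


def gumbel_limit(z, lam_alpha=1.0):
    """Limit in the theorem: exp(-(lambda*alpha)**(-1) * exp(-z))."""
    return exp(-exp(-z) / lam_alpha)


n, a, z = 2000, 1.0, 0.0
u_n = a * (log(n) + z)    # argument of G when omega = 1 and alpha = 1/a
print(max_energy_cdf(n, a, u_n), gumbel_limit(z))
```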
Proof. For brevity let us put

$$\alpha^{-1} \ln N\, \{1 + \ln \omega(\alpha^{-1} \ln N)/\ln N + z/\ln N\} = u_N.$$

According to (21), the theorem will be proved if we show that, as $N \to \infty$,

$$\{K(u_N)\}^N \to \exp(-(\lambda\alpha)^{-1} e^{-z}) \tag{$*$}$$

and

$$\Pi_N(u_N, E)/\Pi_0(0, E) \to 1. \tag{$**$}$$

As in §3, we shall suppose here that the function $\omega(z) e^{-\gamma z}$ decreases for any $\gamma > 0$ and for sufficiently large $z$. Moreover we shall make the further assumption that as $z \to \infty$

$$\omega'(z)/\omega(z) = O(z^{-1}). \tag{22}$$

[Both assumptions are always fulfilled in real physical problems where $\omega(z)$ is usually approximated by some power of the variable $z$. It is easy, however, to show that our first assumption is a consequence of the second.]

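In order to prove relation (*), let us note that integration by parts gives

$$1 - K(u) = \lambda^{-1} \int_u^\infty \omega(z) e^{-\alpha z}\, dz = (\lambda\alpha)^{-1} \omega(u) e^{-\alpha u} + (\lambda\alpha)^{-1} \int_u^\infty \omega'(z) e^{-\alpha z}\, dz.$$

But it follows from (22) that, as $z \to \infty$, $\omega'(z) = o[\omega(z)]$ and therefore as $u \to \infty$

$$\int_u^\infty \omega'(z) e^{-\alpha z}\, dz = o\Bigl\{\int_u^\infty \omega(z) e^{-\alpha z}\, dz\Bigr\},$$

and we find

$$1 - K(u) = (\lambda\alpha)^{-1} \omega(u) e^{-\alpha u} + o\{1 - K(u)\}.$$

Hence, as $u \to \infty$

$$1 - K(u) = (\lambda\alpha)^{-1} \omega(u) e^{-\alpha u} \{1 + o(1)\}. \tag{23}$$

Further, it follows from our first assumption that for any $\delta > 0$ and for sufficiently large $z$ we have $\omega(z) < e^{\delta z}$ and, hence, $\ln \omega(z) < \delta z$. In other words we have $\ln \omega(y) = o(y)$ and consequently, in particular,

$$\ln \omega(\alpha^{-1} \ln N) = o(\ln N).$$

Whence,

$$1 + \ln \omega(\alpha^{-1} \ln N)/\ln N + z/\ln N = 1 + o(1) \qquad (N \to \infty),$$

and thus,

$$u_N = \alpha^{-1} \ln N + o(\ln N).$$

Consequently,

$$\omega(u_N) = \omega\{\alpha^{-1} \ln N + o(\ln N)\} = \omega(\alpha^{-1} \ln N) + o[\ln N\; \omega'(\alpha^{-1} \ln N)].$$

Therefore, by (22) as $N \to \infty$

$$\omega(u_N) = \omega(\alpha^{-1} \ln N) + o[\omega(\alpha^{-1} \ln N)] \sim \omega(\alpha^{-1} \ln N). \tag{24}$$

On the other hand, we have

$$e^{-\alpha u_N} = \exp[-\ln N - \ln \omega(\alpha^{-1} \ln N) - z] = e^{-z} [N \omega(\alpha^{-1} \ln N)]^{-1}. \tag{25}$$

From (23), (24) and (25) it follows that as $N \to \infty$

$$1 - K(u_N) \sim e^{-z} [\lambda\alpha N]^{-1},$$

$$K(u_N) = 1 - e^{-z} [\lambda\alpha N]^{-1} + o(N^{-1}), \tag{26}$$

$$\{K(u_N)\}^N \to \exp(-e^{-z}/\lambda\alpha),$$

which proves relation (*).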

which proves relation (*). 
Turning now to the relation

$$\Pi_N(u_N, E)/\Pi_0(0, E) \to 1,$$

we notice first of all that the denominator of this ratio, by definition, coincides with the quantity $\Phi_N(E)$, which we investigated in §3. We saw there that as $N \to \infty$

$$\Pi_0(0, E) = \Phi_N(E) \sim (2\pi Nb)^{-1/2},$$

where $b$ is the dispersion of the distribution whose density is

$$\varphi(z) = \lambda^{-1} \omega(z) e^{-\alpha z}.$$

Thus we have to investigate only the numerator $\Pi_N(u_N, E)$. This quantity is the density, at the point $E$, of the distribution of the sum of $N$ mutually independent random variables, each of which is distributed with the density (see §4)

$$f_1(z) = \varphi_{u_N}(z)\, \omega(z) e^{-\alpha z} / \lambda K(u_N).$$

In §4 the mathematical expectation of this term was denoted by $M_1^{(1)}$. It equals

$$M_1^{(1)} = [\lambda K(u_N)]^{-1} \int_0^{u_N} z\, \omega(z) e^{-\alpha z}\, dz, \tag{27}$$

and as $N \to \infty$ it approaches

$$\lambda^{-1} \int_0^\infty z\, \omega(z) e^{-\alpha z}\, dz = a = E/N. \tag{28}$$

Therefore, the mathematical expectation of the sum will be near $Na = E$, i.e., just that point at which we must estimate the density of the distribution of this sum. In order to make this estimate we must establish the order of smallness of the difference $M_1^{(1)} - a$. According to (27) and (28), we have

$$\lambda [a - M_1^{(1)}] = \int_0^\infty z\, \omega(z) e^{-\alpha z}\, dz - [K(u_N)]^{-1} \int_0^{u_N} z\, \omega(z) e^{-\alpha z}\, dz$$

$$= \int_{u_N}^\infty z\, \omega(z) e^{-\alpha z}\, dz - [1 - K(u_N)] [K(u_N)]^{-1} \int_0^{u_N} z\, \omega(z) e^{-\alpha z}\, dz.$$
By (26), the second term, as $N \to \infty$, has the form $O(N^{-1})$. As for the first term, integration by parts gives

$$\int_{u_N}^\infty z\, \omega(z) e^{-\alpha z}\, dz = \alpha^{-1} u_N\, \omega(u_N) e^{-\alpha u_N} + \alpha^{-1} \int_{u_N}^\infty [\omega(z) + z\, \omega'(z)]\, e^{-\alpha z}\, dz.$$

It is obvious from (23), (26) and $u_N = O(\ln N)$, that the first term of this formula has the form $O(N^{-1} \ln N)$. Using (22), we see that the second term is of order $O[1 - K(u_N)] = O(N^{-1})$. Thus,

$$a - M_1^{(1)} = O(N^{-1} \ln N),$$

and it follows that

$$E - N M_1^{(1)} = N [a - M_1^{(1)}] = O(\ln N). \tag{29}$$

If, for brevity, the dispersion $M_2^{(1)} - [M_1^{(1)}]^2$ of the law $f_1(z)$ is denoted by $b_N$, then the local limit theorem [see (11), §3] yields

$$\Pi_N(u_N, E) \sim (2\pi N b_N)^{-1/2} \exp(-[E - N M_1^{(1)}]^2 / 2 N b_N) \qquad (N \to \infty).$$

It is obvious that, as $N \to \infty$, the dispersion $b_N$ of the law $f_1(z)$ tends to the dispersion $b$ of the law $\varphi(z)$. Since, by (29)

$$[E - N M_1^{(1)}]^2 / 2 N b_N = O(N^{-1} \ln^2 N),$$

we have

$$\Pi_N(u_N, E) \sim (2\pi N b_N)^{-1/2} \qquad (N \to \infty),$$

i.e., $\Pi_N$ coincides asymptotically with $\Pi_0(0, E)$. This proves relation (**) and simultaneously completes the proof of our theorem.

Now we turn to the distribution of the minimum energy of a particle. 
This problem is solved analogously to the previous problem, but the cal- 
culation here is much simpler. First of all, it is obvious that the probability 
that the minimum energy of a particle is less than u is equal to 


$$g(u) = 1 - P\{\psi_N(u) = 0\}.$$


In order to get an asymptotic estimate of g(u), we can again use our 
basic formula (18), this time putting k = 0, 


$$P\{\psi_N(u) = 0\} = \{1 - K(u)\}^N\,\Pi_0(u, E)/\Pi_0(0, E).$$


Let us denote by $v(u)$ the volume of that part of the phase space of a
particle in which its energy is less than $u$. Then, by the definition of the
function $\omega(x)$,


$$v(u) = \int_0^u \omega(x)\,dx.$$
The function $v(u)$ is continuous and increasing. Let us denote by $u(v)$
the inverse function which, consequently, is also continuous and increasing
for any $v > 0$. Then the following limit theorem solves our problem.

THEOREM. For any constant $x > 0$, and as $N \to \infty$,


$$g\{u(x/N)\} \to 1 - \exp(-x/\Lambda).$$


Proof. Obviously, the theorem will be proved if we can show that, as
$N \to \infty$,


$$[1 - K\{u(x/N)\}]^N \to \exp(-x/\Lambda) \tag{*}$$
and
$$\Pi_0\{u(x/N), E\}/\Pi_0(0, E) \to 1. \tag{**}$$


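We note that, as $u \to 0$,


$$K(u) = \Lambda^{-1}\int_0^u \omega(x)e^{-\alpha x}\,dx = \Lambda^{-1}\int_0^u \omega(x)[1 + O(x)]\,dx = \Lambda^{-1}v(u)[1 + O(u)].$$
In particular, as $N \to \infty$,
$$K\{u(x/N)\} = \Lambda^{-1}(x/N)[1 + o(1)].$$


From this we easily establish relation (*).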

In order to prove relation (**), we note first of all that the denominator
$\Pi_0(0, E)$ is exactly the same as in the preceding theorem. We know that
as $N \to \infty$ it is equivalent to $(2\pi N b)^{-1/2}$. Thus, again, we have to
estimate only the numerator $\Pi_0\{u(x/N), E\}$. We shall not carry out the
estimate explicitly, as the calculation is very similar to that of the
preceding theorem. The only difference is that this calculation is much
easier. In this case the basic problem is to estimate the difference
$M_2^{(1)} - a$. We easily find that as $N \to \infty$


$$M_2^{(1)} - a = O(N^{-1}).$$


The application of the local limit theorem now shows, exactly as in the
preceding theorem, that as $N \to \infty$


$$\Pi_0\{u(x/N), E\} \sim (2\pi N b)^{-1/2}.$$
This proves relation (**) and hence the entire theorem.
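
As a numerical illustration that is not part of the original argument, the limit for the minimum energy can be checked in one special case where everything is explicit. The sketch below assumes a hypothetical particle with $\omega(x) \equiv 1$ (our assumption, not the text's), so that the energies are uniformly distributed on the simplex $e_1 + \cdots + e_N = Na$, $v(u) = u$, and $\Lambda = 1/\alpha = a$. The names `g_exact` and `g_limit` are ours.

```python
import math


def g_exact(x, N, a=1.0):
    """Exact P{minimum energy < x/N}, assuming omega(x) == 1.

    Under that assumption the energies are uniform on the simplex
    e_1 + ... + e_N = E with E = N*a, for which
    P{all e_i >= t} = (1 - N*t/E)**(N - 1) when N*t <= E.
    """
    E = N * a
    t = x / N
    if N * t >= E:
        return 1.0
    return 1.0 - (1.0 - N * t / E) ** (N - 1)


def g_limit(x, a=1.0):
    """Limit value 1 - exp(-x/Lambda); in this special case Lambda = a."""
    return 1.0 - math.exp(-x / a)


for N in (10, 100, 10_000):
    print(N, g_exact(1.0, N), g_limit(1.0))
```

In this special case the gap between the exact value and the limit shrinks roughly like $1/N$.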


§6. The basic limit theorem 


Let us turn now to the general asymptotic analysis of (18), §4, which
will be used to establish the most important limit theorem of the theory.
First we make a few preliminary remarks. The quantity $N\psi_N(u)$ denotes
the number of particles whose energies are less than $u$. Formula (18) gives
the probability that this number is equal to $k$. According to the results of
§3, for large $N$ the probability that the energy of an individual particle is
less than $u$ is near $K(u)$. If this probability were exactly $K(u)$ and if the
energies of the different particles, as random variables, were mutually
independent, then we would be dealing with a simple case of Bernoulli trials:
$N\psi_N(u)$ would simply be the number of successes in $N$ independent trials
of a certain event ($e_i < u$) whose probability for each individual trial is
$K(u)$. The probability that this number of successes equals $k$ would be


$$C(N, k)K^k(u)[1 - K(u)]^{N-k},$$


which is just the first factor of the right side of (18). However, the particle
energies are dependent since their sum must equal the total energy $E = Na$.
The second factor


$$\Pi_k(u, E)/\Pi_0(0, E)$$


of the right side of (18) is a “correction factor” which takes account of
the mutual dependence of the energies of the particles. Since this
dependence is very weak (we have a large number $N$ of random variables $e_i$
related by the one equation $\sum_i e_i = Na$), we can expect that the correction
factor in many cases will be nearly unity. For example, in both problems
of §5 we saw that it tends to unity under appropriate conditions. (This is
precisely why the problems of §5 were solvable by relatively elementary
methods.)

We now perform the detailed calculation. In this section we consider the 
number u to be constant and therefore, for brevity, we write K instead of 
K(u). Otherwise we retain the notation of the preceding section. 

The mathematical expectation of the quantity $\psi_N(u)$ is obviously equal
to $P(e_i < u)$, which (see §3) is nearly equal to $K$. We shall consider at
first only those values of $k$ for which $k/N$ is nearly equal to this
mathematical expectation and thus nearly equal to $K$. Let us put


$$(k/N) - K = h.$$
We shall assume in the following that as $N \to \infty$
$$h = O(N^{-1/2}).$$


By using the (local) Laplace-de Moivre theorem, we find for the first
factor of the right side of (18) the asymptotic expression


$$C(N, k)K^k(1 - K)^{N-k} = (1 + o(1))[2\pi N K(1 - K)]^{-1/2} \exp[-Nh^2/2K(1 - K)]. \tag{30}$$
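
The quality of the approximation (30) is easy to inspect numerically. The following sketch, which is our illustration and not part of the text, evaluates the exact binomial term in log space and divides it by the right side of (30) without the factor $1 + o(1)$. The values $N = 10\,000$, $K = 0.3$ and $k = 3050$ are arbitrary choices for which $h = 0.5\,N^{-1/2}$.

```python
import math


def binom_term(N, k, K):
    """Exact C(N,k) K^k (1-K)^(N-k), computed in log space to avoid overflow."""
    log_p = (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
             + k * math.log(K) + (N - k) * math.log(1.0 - K))
    return math.exp(log_p)


def laplace_term(N, k, K):
    """Right side of (30) with the factor 1 + o(1) dropped."""
    h = k / N - K
    var = K * (1.0 - K)
    return math.exp(-N * h * h / (2.0 * var)) / math.sqrt(2.0 * math.pi * N * var)


N, K = 10_000, 0.3
k = 3_050  # h = k/N - K = 0.005 = 0.5 * N**(-1/2)
ratio = binom_term(N, k, K) / laplace_term(N, k, K)
print(ratio)
```

A ratio close to unity is what the factor $1 + o(1)$ in (30) asserts for $h = O(N^{-1/2})$.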
Consider now the last (correction) factor. Its denominator, which is
independent of $k$ and $u$, is equal to


$$\Pi_0(0, E) = [1 + o(1)](2\pi N b)^{-1/2}, \tag{31}$$
where $b$ denotes the dispersion of the distribution whose density is
$$\varphi(x) = \Lambda^{-1}\omega(x)e^{-\alpha x}.$$


Thus, for the asymptotic analysis of (18) for values of $k/N$ near $K$, we
have to investigate only the behavior of the numerator $\Pi_k(u, E)$ of the
correction factor. Let us recall that $\Pi_k(u, E)$ is the density, at the point
$E = Na$, of the distribution of the sum $s_N$ of $N$ mutually independent
random variables, of which the first $k$ are distributed according to the law
$f_1(x)$, and the remaining $N - k$ are distributed according to the law $f_2(x)$
(see §4). The mathematical expectations of these two laws were denoted in
§4 by $M_1^{(1)}$ and $M_2^{(1)}$. For brevity we shall denote them here by $M_1$ and
$M_2$.

The second moments of these distributions will be denoted by $M_1^{(2)}$ and
$M_2^{(2)}$, and their dispersions by $D_1$ and $D_2$, so that


$$D_1 = M_1^{(2)} - M_1^2, \qquad D_2 = M_2^{(2)} - M_2^2.$$


Denoting the mathematical expectation and the dispersion of the sum $s_N$
by $\langle s_N \rangle$ and $Ds_N$, respectively, and using formulas (19) and (20) of §4,
we have


$$\langle s_N \rangle = kM_1 + (N - k)M_2 = N[(K + h)M_1 + (1 - K - h)M_2]$$
$$= N[KM_1 + (1 - K)M_2 + h(M_1 - M_2)] = N(a + h\Delta),$$
where $\Delta = M_1 - M_2$. Further, for $h = O(N^{-1/2})$

$$Ds_N = kD_1 + (N - k)D_2$$
$$= N[(K + h)(M_1^{(2)} - M_1^2) + (1 - K - h)(M_2^{(2)} - M_2^2)]$$
$$= N[KM_1^{(2)} + (1 - K)M_2^{(2)} - KM_1^2 - (1 - K)M_2^2 + O(N^{-1/2})]$$
$$= N[b + a^2 - KM_1^2 - (1 - K)M_2^2 + O(N^{-1/2})] = ND_0 + O(N^{1/2}),$$


where


$$D_0 = b + a^2 - KM_1^2 - (1 - K)M_2^2.$$


Hence, $D_0$ is constant for constant $u$.

To get an estimate of the quantity $\Pi_k(u, E)$ we can now apply the local
limit theorem since all of the conditions are satisfied (see [8; p. 228]). We
find that
$$\Pi_k(u, E) = [1 + o(1)](2\pi Ds_N)^{-1/2} \exp(-[E - \langle s_N \rangle]^2/2Ds_N)$$
$$= [1 + o(1)](2\pi N D_0)^{-1/2} \exp(-Nh^2\Delta^2/2D_0). \tag{32}$$
Using the estimates (30), (31) and (32), we can put (18) in the form
$$P\{\psi_N(u) = k/N\} = [1 + o(1)]\,b^{1/2}[2\pi N K(1 - K)D_0]^{-1/2} \exp[-Nh^2\{1/2K(1 - K) + \Delta^2/2D_0\}],$$
where $h = (k/N) - K = O(N^{-1/2})$. Let us put
$$K(1 - K)D_0/b = D.$$


Then, under the same assumption, we obtain
$$P\{\psi_N(u) = k/N\} = [1 + o(1)](2\pi N D)^{-1/2} \exp[-Nh^2\{D_0 + K(1 - K)\Delta^2\}/2bD].$$
Finally, it follows from (19) of §4 that
$$D_0 + K(1 - K)\Delta^2 = b + a^2 - KM_1^2 - (1 - K)M_2^2 + K(1 - K)(M_1^2 + M_2^2 - 2M_1M_2)$$
$$= b + a^2 - [KM_1 + (1 - K)M_2]^2 = b.$$
Consequently, the dispersion of $\psi_N(u)$ is $DN^{-1}$, and we have
$$P\{\psi_N(u) = k/N\} = [1 + o(1)](2\pi N D)^{-1/2} \exp(-Nh^2/2D), \qquad [h = (k/N) - K = O(N^{-1/2})]. \tag{33}$$


This relation gives us the desired local limit theorem. Since the estimate
given by (33) obviously holds uniformly for $A/N^{1/2} < h < B/N^{1/2}$, where
$A$ and $B$ ($A < B$) are any constants, we can use the classical method to
find the corresponding integral form:

THEOREM. Let $A$ and $B$ ($A < B$) be any constants. Then for constant $u > 0$
and $N \to \infty$


$$P\{AN^{-1/2} < \psi_N(u) - K(u) < BN^{-1/2}\} \to \pi^{-1/2}\int_{A/(2D)^{1/2}}^{B/(2D)^{1/2}} e^{-z^2}\,dz,$$


where $D$ depends on $u$, but not on $N$.
Using simple transformations we easily obtain the following more
convenient form for $D$:


$$D = K(1 - K)\{1 - \Delta^2 K(1 - K)/b\}. \tag{34}$$
We can infer from the beginning of this section that if the energies of the
particles could be considered mutually independent, then the dispersion of
$\psi_N(u)$ would asymptotically equal $K(1 - K)N^{-1}$. Formula (34) shows,
therefore, that taking account of the mutual dependence of the energies of the
various particles always decreases the dispersion of $\psi_N(u)$. This could have
been foreseen since the relation $\sum_i e_i = E$, which relates the energies of
the particles, obviously leads to a negative mutual correlation between the
energies of any pair of particles.
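
The reduction of the dispersion expressed by (34) can be checked against an exact computation in a special case. The sketch below is our illustration and assumes $\omega(x) \equiv 1$, so that $\varphi(x) = \alpha e^{-\alpha x}$, $a = 1/\alpha$, $b = 1/\alpha^2$, and the energies are uniform on the simplex $\sum_i e_i = N/\alpha$. Under this assumption $P\{e_1 \ge u\}$ and $P\{e_1 \ge u, e_2 \ge u\}$ have closed forms, so $N$ times the exact dispersion of $\psi_N(u)$ can be compared with $D$. The function names are hypothetical.

```python
import math


def D_formula(u, alpha=1.0):
    """Formula (34), assuming omega(x) == 1, i.e. phi(x) = alpha*exp(-alpha*x)."""
    a = 1.0 / alpha                 # mean energy per particle
    b = 1.0 / alpha ** 2            # dispersion of phi
    K = 1.0 - math.exp(-alpha * u)
    M2 = u + a                      # mean of phi restricted to x > u
    M1 = (a - (1.0 - K) * M2) / K   # from K*M1 + (1 - K)*M2 = a
    delta = M1 - M2
    return K * (1.0 - K) * (1.0 - delta ** 2 * K * (1.0 - K) / b)


def N_var_exact(u, N, alpha=1.0):
    """N * Var(psi_N(u)) for energies uniform on the simplex sum e_i = N/alpha."""
    E = N / alpha
    q1 = math.exp((N - 1) * math.log1p(-u / E))        # P{e_1 >= u}
    q2 = math.exp((N - 1) * math.log1p(-2.0 * u / E))  # P{e_1 >= u, e_2 >= u}
    cov = q2 - q1 * q1          # covariance of the two indicator variables
    return q1 * (1.0 - q1) + (N - 1) * cov


u = 1.0
K = 1.0 - math.exp(-u)
print(D_formula(u), N_var_exact(u, 100_000), K * (1.0 - K))
```

In this example the covariance `cov` is negative, in agreement with the remark on the negative mutual correlation, and $D$ is well below the Bernoulli value $K(1 - K)$.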

As in the case of the classical Laplace-de Moivre theorem, it follows
immediately from the proof of our theorem that the limit relation holds
uniformly with respect to $A$ and $B$ ($-\infty \le A < B \le \infty$). In particular,
as $N \to \infty$ we have


$$P\{\psi_N(u) - K(u) > AN^{-1/2}\} \to \pi^{-1/2}\int_{A/(2D)^{1/2}}^{\infty} e^{-z^2}\,dz,$$
$$P\{\psi_N(u) - K(u) < BN^{-1/2}\} \to \pi^{-1/2}\int_{-\infty}^{B/(2D)^{1/2}} e^{-z^2}\,dz. \tag{35}$$
Finally, as the Bernoulli theorem (the law of large numbers) follows
directly from the Laplace-de Moivre theorem, so our theorem obviously
has the following corollary.
COROLLARY. For any $\varepsilon > 0$, no matter how small, and for any constant $u$,
as $N \to \infty$


$$P\{|\psi_N(u) - K(u)| > \varepsilon\} \to 0. \tag{36}$$


§7. Probability that the energy of a particle lies in a given interval 


Previously the particle energies were denoted by $e_1, \cdots, e_N$ and were
arranged in an arbitrary order. Now we shall suppose that these energies
are arranged in order of increasing magnitude, so that, for example, $e_1$
stands for the minimum, and $e_N$ for the maximum energy of a particle.

In §5 we studied the distributions of $e_1$ and $e_N$. Using the basic limit
theorem proved in §6 we can study, to the same accuracy, the probabilities
that particle energies lie in given intervals. We shall devote the present
section to this study.

Let us denote by $u_k$ the positive number defined by the equation


$$K(u_k) = k/N$$
and let us put
$$u_k + AN^{-1/2} = u', \qquad u_k + BN^{-1/2} = u'',$$


where $A$ and $B$ ($A < B$) are arbitrary constants. Then, obviously
$$P\{u' < e_k < u''\} = P\{\psi_N(u') < k/N < \psi_N(u'')\}$$
$$= 1 - P\{\psi_N(u') > k/N\} - P\{\psi_N(u'') < k/N\}$$
$$= 1 - P\{\psi_N(u') > K(u_k)\} - P\{\psi_N(u'') < K(u_k)\}. \tag{37}$$


In order to estimate the probabilities which appear on the right side of
this equation, let us note that


$$K(u') = K(u_k) + (u' - u_k)K'(u_k) + O\{(u' - u_k)^2\}$$
$$= K(u_k) + AN^{-1/2}K'(u_k) + O(N^{-1}).$$


Whence,
$$K(u_k) = K(u') - AN^{-1/2}K'(u_k) + O(N^{-1}).$$
Analogously,
$$K(u_k) = K(u'') - BN^{-1/2}K'(u_k) + O(N^{-1}).$$
Therefore, we have
$$P\{\psi_N(u') > K(u_k)\} = P\{\psi_N(u') - K(u') > -N^{-1/2}AK'(u_k) + O(N^{-1})\},$$
$$P\{\psi_N(u'') < K(u_k)\} = P\{\psi_N(u'') - K(u'') < -N^{-1/2}BK'(u_k) + O(N^{-1})\}.$$
According to the limit relations (35), §6, we easily find, as $N \to \infty$


$$P\{\psi_N(u') > K(u_k)\} - \pi^{-1/2}\int_{-AK'(u_k)(2D)^{-1/2}}^{\infty} e^{-z^2}\,dz \to 0,$$
$$P\{\psi_N(u'') < K(u_k)\} - \pi^{-1/2}\int_{-\infty}^{-BK'(u_k)(2D)^{-1/2}} e^{-z^2}\,dz \to 0,$$
where $D$ depends, of course, on $u_k$ (and thus on $k$). The character of this
dependence was established in §6. Comparing these relations with (37), we
easily find


$$P\{u_k + AN^{-1/2} < e_k < u_k + BN^{-1/2}\} - \pi^{-1/2}\int_{AK'(u_k)(2D)^{-1/2}}^{BK'(u_k)(2D)^{-1/2}} e^{-z^2}\,dz \to 0 \qquad (N \to \infty).$$
We see that for large values of $N$, the energy $e_k$ is approximately
distributed according to a normal law with mean $u_k$ and dispersion
$$D(u_k)/N\{K'(u_k)\}^2.$$


More definite conclusions can, of course, be obtained only by making some
special assumptions about the growth of the index $k$ as $N \to \infty$. The
simplest and most natural assumption is that the ratio $k/N$ tends to a certain
definite limit $\lambda$ ($0 < \lambda < 1$), as $N \to \infty$. Then, determining the number
$u_0$ from the equation


$$K(u_0) = \lambda,$$


we can assert that as $N \to \infty$, the distribution of the quantity
$N^{1/2}(e_k - u_k)$ tends to a normal law with mean 0 and with constant dispersion


$$D(u_0)/[K'(u_0)]^2.$$
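
To give a feeling for the size of this limiting dispersion, the following sketch evaluates $D(u_0)/[K'(u_0)]^2$ in the special case $\omega(x) \equiv 1$ (an assumption made only for this illustration), where $K(u) = 1 - e^{-\alpha u}$ and $b = 1/\alpha^2$, and compares it with the value $K(1 - K)/[K'(u_0)]^2$ that independent energies would give. The function names are ours.

```python
import math


def limiting_dispersion(lam, alpha=1.0):
    """D(u0)/[K'(u0)]^2 for k/N -> lam, assuming omega(x) == 1."""
    u0 = -math.log(1.0 - lam) / alpha         # solves K(u0) = lam
    K = lam
    K_prime = alpha * math.exp(-alpha * u0)   # K'(u0) = phi(u0)
    a, b = 1.0 / alpha, 1.0 / alpha ** 2
    M2 = u0 + a                               # mean of phi restricted to x > u0
    M1 = (a - (1.0 - K) * M2) / K             # from K*M1 + (1 - K)*M2 = a
    D = K * (1.0 - K) * (1.0 - (M1 - M2) ** 2 * K * (1.0 - K) / b)
    return D / K_prime ** 2


def independent_dispersion(lam, alpha=1.0):
    """The same quantity with D replaced by the Bernoulli value K(1 - K)."""
    K_prime = alpha * (1.0 - lam)
    return lam * (1.0 - lam) / K_prime ** 2


print(limiting_dispersion(0.5), independent_dispersion(0.5))
```

For $\lambda = 1/2$ and $\alpha = 1$ the first value reduces algebraically to $1 - (\ln 2)^2$, against 1 for independent energies.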


§8. A functional limit theorem 


In §6 we established that for any constant value of $u$ ($0 < u < \infty$),
the random variable $\psi_N(u)$ tends in probability to $K(u)$ as $N \to \infty$. This
is just the content of the corollary of the basic limit theorem we
introduced at the end of §6. Now our problem is to establish the fact that
the function $\psi_N(u)$ on its entire range (i.e., for $0 < u < \infty$), for large $N$
and with very high probability, is arbitrarily close to the function $K(u)$. In
order to impart to this assertion an accurate meaning, it is of course
necessary to define accurately the mutual distance of the functions $\psi_N(u)$ and
$K(u)$. Since both functions are distributions on the half-line $(0, \infty)$, it is
natural to look for a convenient general definition of distance between two
such distributions. Let $F_1(u)$ and $F_2(u)$ be two distributions on the
half-line $(0, \infty)$. The simplest definition of the distance between them is the
upper bound


$$\sup_{0<u<\infty} |F_1(u) - F_2(u)|,$$


and it would be easy to show that this distance between the functions $\psi_N(u)$
and $K(u)$ actually tends to 0 in probability as $N \to \infty$. However, these
two functions are actually much closer; and since this fact is important
for our calculations, it will be more advantageous to introduce a somewhat
different definition of the distance between two distributions. Let us put


$$\rho_\beta(F_1, F_2) = \sup_{0<u<\infty} [\,|F_1(u) - F_2(u)|\,e^{\beta u}],$$


where $\beta$ is a positive constant. This definition will give us, of course, a
certain one-parameter family of distances. Two functions, for which
$\rho_\beta(F_1, F_2)$ is small, must be very near to one another for large values of $u$
since for $\rho_\beta(F_1, F_2) < \varepsilon$ we have


$$|F_1(u) - F_2(u)| < \varepsilon e^{-\beta u}.$$


We can now prove that if $\beta$ is sufficiently small, then for large $N$ the
distance $\rho_\beta(\psi_N, K)$ becomes arbitrarily small with a very high probability,
and consequently the functions $\psi_N(u)$ and $K(u)$ become extremely close
to one another. In particular, we can prove the following proposition:


222 SYMMETRIC FUNCTIONS ON MULTI-DIMENSIONAL SURFACES (SUPP. VI 


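THEOREM. For any $\varepsilon > 0$ and $\beta$ ($0 < \beta < \alpha$),
$$P\{\rho_\beta(\psi_N, K) > \varepsilon\} \to 0 \qquad (N \to \infty).$$


First we must prove the following lemma.
LEMMA. Let $\varepsilon > 0$ and $\beta < \alpha$. Then as $u \to \infty$, we have


$$P\{\sup_{v>u} [(1 - \psi_N(v))e^{\beta v}] > \varepsilon\} \to 0$$


uniformly with respect to $N$.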
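Proof of the lemma. Let $\beta < \gamma < \alpha$. Using the lemma of §3, we have


$$P\{e_1 > u\} < e^{-\gamma u}$$


for any $N$ and for sufficiently large $u$. Therefore, the series (the order of
particle enumeration is again arbitrary)


$$\sum_{k=0}^{\infty} e^{\beta k}P(e_1 > k)$$
converges. Since
$$P(e_1 > u) = 1 - E\{\psi_N(u)\},$$
the series


$$\sum_{k=0}^{\infty} e^{\beta k}[1 - E\{\psi_N(k)\}]$$


also converges.
Let $\varepsilon e^{-\beta} = \varepsilon'$ and, for $k = 0, 1, 2, \cdots$, let


$$P\{1 - \psi_N(k) > \varepsilon' e^{-\beta k}\} = p_k.$$
Since the Chebyshev inequality yields

$$p_k \le (1/\varepsilon')e^{\beta k}[1 - E\{\psi_N(k)\}],$$
the series $\sum_{k=0}^{\infty} p_k$ also converges. From this we conclude that
$$P\{\sup_{k \ge r} [(1 - \psi_N(k))e^{\beta k}] > \varepsilon'\} \le \sum_{k=r}^{\infty} p_k \to 0 \qquad (r \to \infty). \tag{38}$$
If for any $k \ge 0$

$$[1 - \psi_N(k)]e^{\beta k} \le \varepsilon',$$

then for $k \le u \le k + 1$

$$\psi_N(u) \ge \psi_N(k) \ge 1 - \varepsilon' e^{-\beta k},$$
and, thus,


$$1 - \psi_N(u) \le \varepsilon' e^{-\beta k} = \varepsilon e^{-\beta(k+1)} \le \varepsilon e^{-\beta u}.$$
Hence it follows from (38) that
$$P\{\sup_{v \ge r} [(1 - \psi_N(v))e^{\beta v}] > \varepsilon\} \to 0 \qquad (r \to \infty),$$


uniformly with respect to $N$. This proves the lemma.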
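Proof of the theorem. Let $A > 0$ be so large that


$$P\{\sup_{v>A} [(1 - \psi_N(v))e^{\beta v}] > \varepsilon\} < \varepsilon/2 \tag{39}$$


for any $N$. This is possible in view of the above lemma. In proving the
lemma of §3, we saw that for $\beta < \gamma < \alpha$ and for sufficiently large $v$


$$1 - K(v) < e^{-\gamma v} < \varepsilon e^{-\beta v}.$$
Therefore, for sufficiently large $v$ it follows from
$$1 - \psi_N(v) < \varepsilon e^{-\beta v}$$
that
$$|\psi_N(v) - K(v)| < \varepsilon e^{-\beta v}.$$
From (39) it further follows that for sufficiently large $A$
$$P\{\sup_{v>A} [\,|\psi_N(v) - K(v)|\,e^{\beta v}] > \varepsilon\} < \varepsilon/2. \tag{40}$$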


Now let $s$ be an integer greater than $2e^{\beta A}/\varepsilon$. For any integer $r$
($0 \le r \le s$) let us define the positive number $u_r$ by the equation


$$K(u_r) = (r/s)K(A),$$
so that
$$0 = u_0 < u_1 < \cdots < u_s = A.$$


According to the corollary (36) of the basic limit theorem of §6, we have
for sufficiently large $N$


$$P\{|\psi_N(u_r) - K(u_r)| > (\varepsilon/2)e^{-\beta A}\} < \varepsilon/2s \qquad (r = 1, 2, \cdots, s)$$


and, consequently [considering the fact that $\psi_N(0) = K(0)$],
$$P\{\sup_{0 \le r \le s} |\psi_N(u_r) - K(u_r)| > (\varepsilon/2)e^{-\beta A}\} < \varepsilon/2. \tag{41}$$
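However, if $0 \le r < s$, and if both inequalities
$$|\psi_N(u_r) - K(u_r)| < (\varepsilon/2)e^{-\beta A},$$
$$|\psi_N(u_{r+1}) - K(u_{r+1})| < (\varepsilon/2)e^{-\beta A}$$
hold, then for $u_r \le u \le u_{r+1}$,
$$K(u) - K(A)/s - (\varepsilon/2)e^{-\beta A} \le K(u_{r+1}) - K(A)/s - (\varepsilon/2)e^{-\beta A}$$
$$= K(u_r) - (\varepsilon/2)e^{-\beta A} < \psi_N(u_r) \le \psi_N(u)$$
$$\le \psi_N(u_{r+1}) < K(u_{r+1}) + (\varepsilon/2)e^{-\beta A}$$
$$= K(u_r) + K(A)/s + (\varepsilon/2)e^{-\beta A}$$
$$\le K(u) + K(A)/s + (\varepsilon/2)e^{-\beta A}.$$


Consequently,
$$|\psi_N(u) - K(u)| < (\varepsilon/2)e^{-\beta A} + 1/s < \varepsilon e^{-\beta A} \le \varepsilon e^{-\beta u}.$$
Therefore, it follows from (41) that
$$P\{\sup_{v \le A} [\,|\psi_N(v) - K(v)|\,e^{\beta v}] > \varepsilon\} < \varepsilon/2.$$


Comparing this inequality with inequality (40), we see that for sufficiently
large $N$


$$P\{\rho_\beta(\psi_N, K) > \varepsilon\} = P\{\sup_{0<v<\infty} [\,|\psi_N(v) - K(v)|\,e^{\beta v}] > \varepsilon\} < \varepsilon.$$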


This proves our theorem. In this way we show that for sufficiently large $N$
the function $\psi_N(u)$ (the assignment of which is equivalent to the
assignment of the whole set of particle energies $e_1, \cdots, e_N$ without regard to
order) is over its entire range extremely close to the function $K(u)$ for the
overwhelming majority of points of the given constant energy surface.
The form of $K(u)$ depends only on the structure of the particles composing
the system (and in particular does not depend on $N$).
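
The closeness of $\psi_N(u)$ to $K(u)$ in the distance $\rho_\beta$ can be observed on a single randomly chosen point of the energy surface. The sketch below is only an illustration under stated assumptions: it takes $\omega(x) \equiv 1$ and $\alpha = 1$, so that a point of the surface can be drawn by normalizing independent exponential variables to the prescribed sum and $K(u) = 1 - e^{-u}$, and it approximates the supremum on a finite grid, which gives a lower bound for $\rho_\beta$. The function names, the grid and the seed are our choices.

```python
import bisect
import math
import random


def sample_energies(N, a=1.0, seed=0):
    """Sorted energies uniform on the simplex sum e_i = N*a (omega == 1 assumed)."""
    rng = random.Random(seed)
    raw = [rng.expovariate(1.0) for _ in range(N)]
    scale = N * a / sum(raw)
    return sorted(x * scale for x in raw)


def rho_beta(energies, beta, alpha=1.0, u_max=40.0, steps=4000):
    """Grid approximation (a lower bound) of sup_u |psi_N(u) - K(u)| * exp(beta*u)."""
    N = len(energies)
    best = 0.0
    for j in range(1, steps + 1):
        u = u_max * j / steps
        psi = bisect.bisect_left(energies, u) / N   # fraction of e_i < u
        K = 1.0 - math.exp(-alpha * u)
        best = max(best, abs(psi - K) * math.exp(beta * u))
    return best


energies = sample_energies(50_000)
print(rho_beta(energies, beta=0.25))
```

The value $\beta = 0.25$ respects the condition $0 < \beta < \alpha$ of the theorem; rerunning with a smaller $N$ typically gives a visibly larger distance.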


§9. Continuous symmetric functions 


Now we turn to the basic problem of our study: the behavior of
symmetric functions of the particle energies on the constant energy surfaces of
the system. We have in mind those functions $f(e_1, \cdots, e_N)$ of the particle
energies which do not depend on the order of the $e_1, \cdots, e_N$, i.e., those
functions which are unchanged by any permutation of the arguments.

It is obvious that the value of such a function at any point on the
constant energy surface $e_1 + \cdots + e_N = E$ is completely determined by the
value, at this point, of the function $\psi_N(u)$, whose properties we developed in
the last section. Thus, we can consider a symmetric function $f(e_1, \cdots, e_N)$
as a functional whose argument is the distribution $\psi_N(u)$. Such a point
of view is especially convenient for us. In fact our goal is the study of the
behavior of $f(e_1, \cdots, e_N)$ as $N \to \infty$ (under the condition that
$e_1 + \cdots + e_N = Na$). Thus, we are actually dealing not with one function
$f$, but with a whole family of such functions (one for each $N$). It is
therefore first necessary to define accurately the meaning of
$f(e_1, \cdots, e_N)$ for any $N$. It is
simplest, of course, to do this by considering $f$ as a functional with
argument $\psi_N(u)$, and defining this functional directly for any discrete
distribution $\psi_N(u)$ with any finite number $N$ of possible values, if all these
values have the same probability $1/N$ and if the arithmetical average of these
values is equal to the constant $a$. Actually, as we shall see, it will be
convenient (and to a certain degree necessary) to give to such a functional a
much broader definition which includes the set of all distributions on the
half-line $(0, \infty)$. So, for example, if


$$f(e_1, \cdots, e_N) = (1/N)(e_1^k + \cdots + e_N^k),$$


where $k$ is a positive constant, we have

$$f(e_1, \cdots, e_N) = \int_0^{\infty} u^k\,d\psi_N(u),$$
and there is no reason why we cannot consider the functional
$$\int_0^{\infty} u^k\,d\psi(u) \tag{42}$$


to be defined for any distribution $\psi(u)$ for which this integral converges.

Of course, this whole method is possible only if $f(e_1, \cdots, e_N)$ is
symmetric for any $N$.

Hence, we shall consider functionals $F[\psi]$ defined at least for all
distributions of the form $\psi_N(u)$. We shall call such a functional (and also the
corresponding symmetric function of the particle energies) continuous, if the
difference $F[\varphi] - F[\psi]$ becomes infinitely small when the distance $\rho(\varphi, \psi)$
between the distributions $\varphi$ and $\psi$ tends to zero. It is of course clear that
this definition of continuity of the functional depends critically on the way
in which we define $\rho(\varphi, \psi)$. If we take, in particular, the definition

$$\rho_\beta(\varphi, \psi) = \sup_{0<u<\infty} [\,|\varphi(u) - \psi(u)|\,e^{\beta u}]$$

which we used in §8, then any functional $F[\psi]$, which is continuous for some
value of the parameter $\beta$, will obviously be continuous for all greater values
of $\beta$. It is easy to compute that for any $k > 0$ the linear functional (42)
is continuous for $\beta > 0$ and discontinuous at $\beta = 0$. It is just this fact that
the majority of the simplest and most useful functionals are discontinuous
at $\beta = 0$ (i.e., when the distance between the distributions $\varphi$ and $\psi$ is
defined simply as the upper bound of the function $|\varphi(u) - \psi(u)|$), which
forces us to introduce the distance $\rho_\beta$. This definition is very convenient
for the development of the theory and, moreover, broadens considerably
the class of continuous functionals.

Our goal is to prove that all functionals $F[\psi]$, continuous for some (not
very large) value of $\beta$, possess the property of “near-constancy” on
constant energy surfaces. This means that at the overwhelming majority of
points of the surface

$$\sum_{i=1}^{N} e(q_i, p_i) = Na,$$
we must show that the functional $F[\psi_N(u)]$ assumes values arbitrarily
close (if $N$ is sufficiently large) to a certain constant (which is independent
of $N$ and the choice of point).

Using the theorem of §8, we see that this number will be $F[K]$, i.e., the
value assumed by $F[\psi]$ for $\psi = K(u)$. Of course, $F[\psi]$ must be defined and
continuous at $\psi = K(u)$. Actually, it is an immediate consequence of the
theorem of §8, and our definition of continuity, that the following theorem
is true.

THEOREM. Let the functional $F[\psi]$ be defined and continuous for $\psi = K(u)$
[and, of course, for all $\psi = \psi_N(u)$] for some $\beta < \alpha$. Then, for any $\varepsilon > 0$,
no matter how small, as $N \to \infty$


$$P\{|F[\psi_N] - F[K]| > \varepsilon\} \to 0.$$


This theorem establishes the “near-constancy” of continuous symmetric
functions of the particle energies on multi-dimensional surfaces of constant
energy.
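
The near-constancy can be seen in a small experiment for the functional (42). The sketch below is our illustration, not part of the text: it assumes $\omega(x) \equiv 1$ and $\alpha = 1$, so that points of the surface $e_1 + \cdots + e_N = Na$ can be drawn by normalizing independent exponential variables, and it restricts $k$ to a positive integer so that $F[K] = \int_0^\infty u^k\,dK(u) = k!/\alpha^k$ has a closed form. The function names and seeds are hypothetical.

```python
import math
import random


def power_functional(energies, k):
    """F[psi_N] = integral of u**k d psi_N(u) = (1/N) * sum of e_i**k."""
    return sum(e ** k for e in energies) / len(energies)


def sample_on_surface(N, a=1.0, seed=0):
    """One point of the surface sum e_i = N*a, uniform on the simplex (omega == 1 assumed)."""
    rng = random.Random(seed)
    raw = [rng.expovariate(1.0) for _ in range(N)]
    scale = N * a / sum(raw)
    return [x * scale for x in raw]


def limit_value(k, alpha=1.0):
    """F[K] for K(u) = 1 - exp(-alpha*u) and integer k: k! / alpha**k."""
    return math.factorial(k) / alpha ** k


for seed in range(3):
    point = sample_on_surface(50_000, seed=seed)
    print(seed, power_functional(point, 2), limit_value(2))
```

Different seeds correspond to different points of the same energy surface; the printed values of the functional stay close to the single number $F[K]$.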


§10. Generalizations of Gibbs’ theorem 


At the end of §3 we proved Gibbs’ theorem. This states that the
microcanonical distribution of a system on the constant energy surface


$$\sum_{i=1}^{N} e(q_i, p_i) = E = Na$$


(which we now denote by $S_N$) implies, in the limit $N \to \infty$, the so-called
canonical distribution for an individual particle in its phase space $\Gamma_1$. This
theorem is of fundamental importance since it gives a theoretical founda- 
tion to the widely used and very convenient canonical method of averaging 
for a system in contact with a heat bath (i.e., freely exchanging energy 
with practically infinitely large surroundings). This theorem provides the 
method of canonical averaging with a rigorous justification. Thus, canonical 
averaging is no longer just a convenient analytical device whose validity 
is established only empirically. 

In this theory, the microcanonical distribution assumption for the “large” 
system remains as a hypothetical and incompletely justified feature. Using 
the results of the last section, we can, to a certain degree, free ourselves 
from this assumption. Instead of the microcanonical distribution hypothesis 
we can limit ourselves to a much more general assumption, which requires 
only that the distribution of the system be symmetric with respect to its 
constituent particles. (Of course, we must also assume that the
distributions are, in a very broad sense, continuous and bounded.)

To establish this result, we retain all the notation of the preceding
section. Let us say that the functional $F[\psi]$, which we considered in §9, is of
positive type, if $F[\psi] \ge 0$ is always true and if $F[K] > 0$. We shall call $F[\psi]$
bounded, if there exists a number $Q > 0$, independent of $N$, such that the
microcanonical average of the quantity $F^2[\psi_N]$ on the surface $S_N$ satisfies


$$[\Omega_N(E)]^{-1}\int_{S_N} (F^2[\psi_N]/\operatorname{grad} E)\,d\Sigma_N < Q, \tag{43}$$


where $d\Sigma_N$ is the “surface” element of the surface $S_N$, and $\Omega_N(E)$ is the
“area” of the total surface. These properties are possessed by the functionals
most frequently met in applications, and, in particular, by all functionals
of the form (42).

The microcanonical distribution of the system on the surface $S_N$ is
completely characterized by noting that its density is proportional to
$1/\operatorname{grad} E$. We now consider a broad class of distributions which are
characterized by densities proportional to the expression


$$F[\psi_N]/\operatorname{grad} E,$$
where $F[\psi]$ is any positive bounded functional, continuous for some $\beta < \alpha$.
(We have a microcanonical distribution if $F[\psi] \equiv 1$.)


Each distribution of the system on the surface $S_N$ implies a certain
completely defined distribution for each particle in its own phase space. For
example, the distribution of the first particle can be described by giving the
probability $P(A_1 \in A)$ that the point $A_1$ of the phase space $\Gamma_1$, representing
the state of the first particle, belongs to some domain $A$ of this space. When
the system is microcanonically distributed on the surface $S_N$, Gibbs’
theorem tells us that as $N \to \infty$


$$P\{A_1 \in A\} \to \int_A e^{-\alpha e(q_1, p_1)}\,dV_1 \Big/ \int_{\Gamma_1} e^{-\alpha e(q_1, p_1)}\,dV_1$$


(see the end of §3). We now prove that this limit relation holds for the
much broader assumptions which we have made. In particular, we prove
the following theorem:

THEOREM. Let $F[\psi]$ be a positive bounded functional, continuous for some
$\beta < \alpha$, and defined for all $\psi_N(u)$ and for $K(u)$. If the system is distributed
on the surface $S_N$ with a density proportional to the quantity


$$F[\psi_N]/\operatorname{grad} E,$$


then in the limit an individual particle will be distributed canonically, i.e.,
for any domain $A$ of the space $\Gamma_1$, as $N \to \infty$


$$P\{A_1 \in A\} \to \int_A e^{-\alpha e(q_1, p_1)}\,dV_1 \Big/ \int_{\Gamma_1} e^{-\alpha e(q_1, p_1)}\,dV_1.$$
Proof. Let us first introduce certain definitions. Let $S_N(A)$ denote the
set of all points of the surface $S_N$ for which $A_1 \in A$ (i.e., the set of all
Hamiltonian variables for which the state of the first particle defines a point
$A_1$ of the space $\Gamma_1$ belonging to the domain $A$ of $\Gamma_1$). The probability
$P(A_1 \in A)$ will be denoted by $P_m(A_1 \in A)$ if the distribution of the system on
the surface $S_N$ is microcanonical, and by $P_s(A_1 \in A)$ if the distribution
satisfies the conditions of our theorem. We obviously have


$$P_m\{A_1 \in A\} = \left\{\int_{S_N(A)} d\Sigma_N/\operatorname{grad} E\right\} \Big/ \left\{\int_{S_N} d\Sigma_N/\operatorname{grad} E\right\} = [\Omega_N(E)]^{-1}\int_{S_N(A)} d\Sigma_N/\operatorname{grad} E, \tag{44a}$$

$$P_s\{A_1 \in A\} = \left\{\int_{S_N(A)} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E\right\} \Big/ \left\{\int_{S_N} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E\right\}. \tag{44b}$$
For brevity, let us put
$$\left\{\int_{S_N} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E\right\}^{-1} = \lambda_N, \qquad F[K] = F^*,$$


so that $F^* > 0$ as in the hypotheses of the theorem.
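We have, by (44b),


$$P_s\{A_1 \in A\} = \lambda_N\int_{S_N(A)} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E$$
$$= \lambda_N F^*\int_{S_N(A)} d\Sigma_N/\operatorname{grad} E + \lambda_N\int_{S_N(A)} (F - F^*)\,d\Sigma_N/\operatorname{grad} E. \tag{45}$$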

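Let $S_N'$ and $S_N''$ denote, respectively, the parts of the surface $S_N$ at which
$|F[\psi_N] - F^*| \le \varepsilon$ and $|F[\psi_N] - F^*| > \varepsilon$, where $\varepsilon$ is an arbitrary positive
constant less than $F^*/2$. In an analogous way we define the parts $S_N'(A)$
and $S_N''(A)$ in the domain $S_N(A)$. Let us also put


$$\int_{S_N'} d\Sigma_N/\operatorname{grad} E = \Omega_N'(E), \qquad \int_{S_N''} d\Sigma_N/\operatorname{grad} E = \Omega_N''(E).$$


Using the theorem of §9, we have
$$P_m\{|F[\psi_N] - F^*| > \varepsilon\} = \Omega_N''(E)/\Omega_N(E) \to 0 \qquad (N \to \infty). \tag{46}$$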
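Further, it is obvious that


$$\left|\lambda_N\int_{S_N'(A)} (F - F^*)\,d\Sigma_N/\operatorname{grad} E\right| \le \varepsilon\lambda_N\int_{S_N'} d\Sigma_N/\operatorname{grad} E = \varepsilon\lambda_N\Omega_N'(E). \tag{47}$$
On the other hand, since $\varepsilon < F^*/2$,

$$\lambda_N^{-1} \ge \int_{S_N'} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E = F^*\Omega_N'(E) + \int_{S_N'} (F - F^*)\,d\Sigma_N/\operatorname{grad} E$$
$$\ge F^*\Omega_N'(E) - \varepsilon\Omega_N'(E) > \tfrac{1}{2}F^*\Omega_N'(E).$$

But by (46)

$$\Omega_N'(E)/\Omega_N(E) > \tfrac{1}{2}$$
for sufficiently large $N$. Therefore,

$$\lambda_N^{-1} > \tfrac{1}{4}F^*\Omega_N(E).$$
Whence,
$$\lambda_N\Omega_N(E) < 4/F^*. \tag{48}$$
In this case, it follows from (47) that


$$\left|\lambda_N\int_{S_N'(A)} (F - F^*)\,d\Sigma_N/\operatorname{grad} E\right| < 4\varepsilon/F^*. \tag{49}$$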
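Further, for any $A$ and sufficiently large $N$, (48) and (46) give


$$\lambda_N\int_{S_N''(A)} F^*\,d\Sigma_N/\operatorname{grad} E \le \lambda_N F^*\int_{S_N''} d\Sigma_N/\operatorname{grad} E$$
$$= \lambda_N F^*\Omega_N''(E) = \lambda_N F^*\Omega_N(E)[\Omega_N''(E)/\Omega_N(E)] < \varepsilon. \tag{50}$$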
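Finally,


$$\lambda_N\int_{S_N''(A)} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E \le \lambda_N\int_{S_N''} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E$$
$$= \lambda_N\int_{S_N'',\,F \le 1/\varepsilon} F\,d\Sigma_N/\operatorname{grad} E + \lambda_N\int_{S_N'',\,F > 1/\varepsilon} F\,d\Sigma_N/\operatorname{grad} E$$
$$\le (\lambda_N/\varepsilon)\Omega_N''(E) + \lambda_N\int_{S_N,\,F > 1/\varepsilon} F\,d\Sigma_N/\operatorname{grad} E$$
$$\le (\lambda_N/\varepsilon)\Omega_N(E)[\Omega_N''(E)/\Omega_N(E)] + \varepsilon\lambda_N\int_{S_N} F^2\,d\Sigma_N/\operatorname{grad} E.$$


If $N$ is so large that
$$\Omega_N''(E)/\Omega_N(E) < \varepsilon^2,$$
then by (48) and (43)


$$\lambda_N\int_{S_N''(A)} F[\psi_N]\,d\Sigma_N/\operatorname{grad} E < 4\varepsilon/F^* + 4\varepsilon Q/F^* = (Q + 1)4\varepsilon/F^*. \tag{51}$$
We now collect our results. From (50) and (51) we have


$$\left|\lambda_N\int_{S_N''(A)} (F - F^*)\,d\Sigma_N/\operatorname{grad} E\right| < \varepsilon\{4(Q + 1)/F^* + 1\}. \tag{52}$$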
But by (49) and (52), it follows from (45) that


$$\left|P_s(A_1 \in A) - \lambda_N F^*\int_{S_N(A)} d\Sigma_N/\operatorname{grad} E\right| < B\varepsilon, \tag{53}$$
where $B$ is a positive constant. This inequality is fulfilled for sufficiently
large $N$ no matter what the domain $A$ of the space $\Gamma_1$. For $A = \Gamma_1$,
$$P_s(A_1 \in A) = 1,$$
and we find
$$|1 - \lambda_N F^*\Omega_N(E)| < B\varepsilon.$$
Since $\varepsilon$ is as small as we like,
$$\lambda_N\Omega_N(E) \to 1/F^* \qquad (N \to \infty),$$
or
$$\lambda_N F^* = (1 + \alpha_N)/\Omega_N(E), \qquad \alpha_N \to 0 \qquad (N \to \infty).$$
Turning to (53) and using (44a), we therefore find
$$|P_s(A_1 \in A) - (1 + \alpha_N)P_m(A_1 \in A)| < B\varepsilon.$$
This means that as $N \to \infty$
$$P_s(A_1 \in A) - P_m(A_1 \in A) \to 0,$$


and consequently, according to Gibbs’ theorem,
$$P_s(A_1 \in A) \to \int_A e^{-\alpha e(q_1, p_1)}\,dV_1 \Big/ \int_{\Gamma_1} e^{-\alpha e(q_1, p_1)}\,dV_1,$$


which was to be proved.